Network Coding for Network Security


Objective 

Use  network  coding  to  enable  greater 
robustness  and  security 

•  Reduce vulnerability to eavesdroppers in networks

•  Provide reliability against Byzantine nodes in changing
conditions

•  Provide  constructive  means  of  creating  schemes  that 
are  as  efficient  as  traditional  point-to-point  coding 
schemes 

[Figure: number of symbols that an intermediate node
has to guess in order to decode one of the symbols]

Scientific/Technical  Approach 

•  Use the algebraic linear mixing of data to derive
intrinsic keys from other data by considering the
diagonalizability of matrices

•  Since  robustness  using  network  coding  depends  on 
having  sufficient  degrees  of  freedom  to  counteract 
attackers  over  the  entire  network,  we  develop  means 
of  tracking  topology  in  changing  P2P  networks 

•  Use network coding for constructing codes that
match the Singleton bound even with unknown attackers

Accomplishments 

•  New  algorithms  that  use  network  coding  for: 

•  data hiding without the use of a key -
ensuring sufficient degrees of freedom to
decode at the receiver in variable settings

•  creating  efficient  coding  schemes  for 
Byzantine  attacks 

•  providing  quantification  of  the  benefits  of 
network  coding 

Challenges 

•  Integrating protection, degree-of-freedom design
and coding.

Key  Accomplishments 


Technical  breakthroughs: 

—  Demonstrated  the  use  of  network  coding  to  provide  intrinsic  cryptographic  protection  for 
wiretapped  networks 

—  Provided  new  means  of  using  network  coding  for  networks  under  attack: 

•  For  distributed  network  coded  storage  networks  (peer-to-peer),  a  method  for  tracking  the 
evolving  topology  of  a  peer-to-peer  network  so  as  to  ensure  sufficient  coded  diversity 
against  attackers 

•  For  general  networks,  a  robust  coding  approach  that  matches  the  Singleton  bound  even 
under  attack  scenarios  for  unknown  attack  locations  as  long  as  a  level  of  diversity  is 
ensured 

•  We  show  that  random  network  coding  provides  better  reliability  than  random  dispersive 
routing  if  there  is  enough  capacity  in  the  network 

The  support  of  AFOSR  in  this  context  is  crucial: 

—  Only  program  to  our  knowledge  that  considers  security  of  network  coding  in  wireline  systems, 
including  P2P 

—  Deployment  of  network  coded  P2P  systems  is  taking  place  commercially  (Microsoft)  and  holds 
great promise for military applications

—  Collaboration  with  industry  (HP)  for  technology  transfer  and  synergies  with  DARPA  IAMANET 
program  (which  is  focused  entirely  on  MANETs  but  can  leverage  some  aspects  of  this 
program),  collaborations  with  general  theoretical  underpinnings  for  network  coding  through  NSF 
program  and  European  program 


Content  distribution  using  network  coding 




Network  coding  operates  by  allowing  mixing  of  data 
What  are  the  security  consequences  of  such  mixtures? 

Two  aspects: 

—  Wiretapping  aspects 

—  Byzantine or pollution attacks - detection and correction: a malicious user
sends packets with a valid linear combination in the header, but garbage in the payload


Wiretapping  aspects 


The  mixture  of  two  messages,  appropriately  compressed,  makes  one  message  a  one-time 
pad  to  another  [CY02] 
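
The one-time-pad effect of mixing can be sketched as follows. This is a minimal illustration, assuming two equal-length messages over GF(2) and a uniformly random second message standing in for "appropriately compressed" data; the function name `xor_mix` is hypothetical and not taken from [CY02].

```python
import secrets


def xor_mix(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length messages (addition over GF(2))."""
    if len(a) != len(b):
        raise ValueError("messages must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"attack at dawn"
# Assumption: the second message is incompressible, modelled here as random bytes.
pad = secrets.token_bytes(len(message))
coded = xor_mix(message, pad)  # what a wiretapper on the coded link observes

# A receiver that already holds one message recovers the other.
assert xor_mix(coded, pad) == message
assert xor_mix(coded, message) == pad
```

A wiretapper who sees only `coded` learns nothing about `message` as long as the second message is uniformly distributed and unknown to it.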

If  we  want  such  mixtures,  one  can  derive  limits  on  network  capacity  [FMSS04] 

In  general 

—  Difficult  to  know  the  maximum  number  of  links  that  can  be  tapped  by  adversary 

—  Such secure coding schemes are sometimes impossible

Main scheme:

—  Use  other  messages  for  “encryption” 

—  If  no  other  messages  are  sent:  identical  to  dispersive  routing  [JM04,  LLF04] 

—  If  other  messages  are  sent:  extra  security  from  network  coding 

Security  can  be  added  via  network  coding  with  lower  cost  than  dispersive  routing  [TM06] 

The level of security provided by random linear network coding is measured by the
number of symbols that an intermediate node has to guess in order to decode one of the
transmitted symbols [LMB07], [LVMB08]


Random  linear  coding  -  a  free  cypher? 


Overview: 

•  Random linear coding (RLC) in effect provides a one-time
pad through the use of data in combination

•  Level  of  security  provided  by  RLC: 

—  Number  of  symbols  that  an  intermediate  node  v  has  to 
guess  in  order  to  decode  one  of  the  transmitted 
symbols 

—  Partial  transfer  matrix 

•  We  consider  these  results  under  different  topologies 

Model, definitions, approach and results:

Two cases w/ relevant information:

1.  Partial transfer matrix has full rank

2.  Partial transfer matrix has diagonalizable parts

Linear combination of independent and uniformly distributed
values in $\mathbb{F}_q$

Product: a zero is obtained only when a factor is zero ($a \times 0$, $a \in \mathbb{F}_q$), which gives
$2q - 1$ zeros among the $q^2$ entries of the multiplication table

Probability $p$ of having $\ge K - 1$ zeros in one or more lines


P(X,m  =()),_  =0 
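
The count of zero products can be checked numerically. The sketch below is restricted to prime q (an assumption made so that integer arithmetic mod q is the field); it compares a brute-force count over the multiplication table with the closed form (2q - 1)/q^2, which holds because a product in a field is zero only when one factor is zero.

```python
def zero_product_probability(q: int) -> float:
    """Fraction of zero entries in the q-by-q multiplication table mod a prime q."""
    zeros = sum(1 for a in range(q) for b in range(q) if (a * b) % q == 0)
    return zeros / (q * q)


for q in (2, 3, 5, 7, 251):
    closed_form = (2 * q - 1) / q**2
    assert abs(zero_product_probability(q) - closed_form) < 1e-12
```

The probability decays roughly as 2/q, so zero coefficients in a coded combination become rare as the field grows.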


Random  linear  coding  -  a  free  cypher? 


Analysing the different possible combinations of the lines that already
have $(K-1)$ zeros and the ones that can be obtained by Gaussian elimination
gives the recoverable number of symbols.

[Figure: $\delta(v)$ degrees of freedom, with $1 \le \delta(v) \le K$; $L = \delta(v) - 1$ lines are used to perform Gaussian elimination]


Byzantine  and  pollution  attacks 


Robustness  against  faulty/malicious  components  with  arbitrary 
behavior,  e.g. 

—  dropping  packets 

—  misdirecting  packets 

—  sending  spurious  information 

Abstraction  as  Byzantine  generals  problem  [LSP82] 

Byzantine  robustness  in  networking  [P88,MR97,KMM98,CL99] 


Problem  setup 


•  Random  linear  network  coding  using  coding  vectors. 

•  A  batch  of  r  packets  is  multicast  from  a  source  node  s  to  a  set  of  sink  nodes. 

•  A  packet  that  is  not  a  linear  combination  of  its  input  packets  is  called 
adversarial. 

•  z0 - the maximum number of adversarial packets

•  m - the minimum source-sink cut capacity

•  p - proportion of redundant symbols in each packet

•  An  omniscient  adversary  can  observe  transmission  on  the  entire  network 

•  Main  results: 

—  If  the  adversary  is  omniscient,  the  information  rate  of  the  code  approaches 
m-2z0  asymptotically  as  the  packet  size  increases. 

—  If  the  adversary  is  NOT  omniscient,  and  the  source  and  the  sinks  share  a 
secret  channel  not  observed  by  the  adversary,  a  rate  of  m-z0  is 
asymptotically  achievable. 

—  Will  give  details  for  the  omniscient  adversary  case. 


Byzantine  and  pollution  attacks  - 
correction  at  decoding  time 

•  Distributed  randomized  network  coding  can  be  extended  to  detect  Byzantine 
behavior 

—  Small  computational  and  communication  overhead 

—  small  number  of  hash  bits  included  with  each  packet,  calculated  as  simple 
polynomial  function  of  data 

•  Require  only  that  a  Byzantine  attacker  does  not  design  and  supply  modified 
packets  with  complete  knowledge  of  other  nodes’  packets 

•  Main  scheme: 

—  Use  a  polynomial  hash 

—  An  attacker  without  full  knowledge  of  the  traffic  will  have  low  probability  of 
being  able  to  match  the  hash 

—  The  hash  can  be  used  to  detect  an  attack  [HLKMEK04] 

•  One  can  further  use  such  a  hash  to  decode 
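
The detection step can be sketched with a polynomial hash evaluated at a secret point. This is only an illustration under stated assumptions: the prime field size `FIELD`, the function names, and the use of a single secret evaluation point `r` are choices made here, not details taken from [HLKMEK04].

```python
import secrets

FIELD = 2_147_483_647  # prime 2**31 - 1, chosen only for illustration


def poly_hash(symbols, r, p=FIELD):
    """Evaluate sum(symbols[k] * r**k) mod p using Horner's rule."""
    acc = 0
    for s in reversed(symbols):
        acc = (acc * r + s) % p
    return acc


def verify(symbols, r, expected, p=FIELD):
    """Accept the packet only if its hash matches the expected value."""
    return poly_hash(symbols, r, p) == expected


payload = [secrets.randbelow(FIELD) for _ in range(16)]
r = secrets.randbelow(FIELD - 1) + 1   # secret nonzero point, unknown to the attacker
tag = poly_hash(payload, r)

tampered = list(payload)
tampered[3] = (tampered[3] + 1) % FIELD  # attacker alters one symbol

assert verify(payload, r, tag)
assert not verify(tampered, r, tag)      # the hashes differ by r**3, which is nonzero
```

For a general modification of n symbols, the difference of the two hashes is a nonzero polynomial in `r` of degree below n, so an attacker who does not know `r` matches the hash with probability at most (n - 1)/p.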

Omniscient adversary case


Input matrix $X$, whose $i$-th row, $x_i$, corresponds to the $i$-th input packet.

—  The first $n - pn - r$ entries of $x_i$ are independent exogenous data symbols.

—  The next $pn$ are redundant symbols.

—  The last $r$ symbols form the coding vector.

An adversarial packet can be viewed as an additional source packet, and $Z$ is the
matrix whose $i$-th row is the $i$-th adversarial packet.

The received packets at a terminal node can be represented by $Y$, given by

$Y = GX + KZ$

where $G$ and $K$ are the linear mappings from the source and the adversarial packets
respectively to the sink.

Let $G'$ be the last $r$ columns of $Y$.

The sink knows $G'$ but not $G$.
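
A small numeric sketch of the model Y = GX + KZ follows, assuming a prime field of size 257 and identity coding vectors in the source packets; all sizes and helper names are illustrative. It shows why the sink, reading the last r columns of Y, sees the source transfer matrix G perturbed by the adversary rather than G itself.

```python
import random

FIELD = 257  # small prime field, for illustration only


def matmul(A, B, p=FIELD):
    """Matrix product over the integers mod p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def matadd(A, B, p=FIELD):
    """Entrywise matrix sum mod p."""
    return [[(a + b) % p for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def rand_matrix(rows, cols, rng, p=FIELD):
    return [[rng.randrange(p) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
r, body_len, z, m = 3, 5, 1, 4  # batch size, data+redundancy symbols, adversarial packets, received packets

# Each source packet is [data and redundancy | coding vector], with unit coding vectors.
X = [rand_matrix(1, body_len, rng)[0] + [int(i == j) for j in range(r)] for i in range(r)]
Z = rand_matrix(z, body_len + r, rng)  # adversarial packets with arbitrary content
G = rand_matrix(m, r, rng)             # source-to-sink transfer matrix
K = rand_matrix(m, z, rng)             # adversary-to-sink transfer matrix

Y = matadd(matmul(G, X), matmul(K, Z))
G_prime = [row[-r:] for row in Y]      # last r columns of Y, read off by the sink

# The header columns give G plus the adversary's contribution, not G alone.
Z_tail = [row[-r:] for row in Z]
assert G_prime == matadd(G, matmul(K, Z_tail))
```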


Omniscient  adversary  case 


Lemma  1: 

With probability at least $1 - |E|/q$, the matrix $G'$ has full column rank, where $|E|$ is
the number of links in the network, and $q$ is the size of the finite field.


Proposition 1:

With probability greater than $1 - q^{-n\epsilon}$, the input matrix $X$ can
be recovered, and the decoding algorithm has
complexity $O(n^3 m^3)$.


Byzantine correction

Block-length $n$ over finite field $\mathbb{F}_q$

[Figure: packet layout built with a Vandermonde matrix; each packet carries $n(1-\epsilon)$ symbols from $\mathbb{F}_q$]

$D_j = T_j(1) \cdot 1 + T_j(2) \cdot r + \dots + T_j(n(1-\epsilon)) \cdot r^{n(1-\epsilon)}$

Does $D_j = T_j(1)' \cdot 1 + T_j(2)' \cdot r + \dots + T_j(n(1-\epsilon))' \cdot r^{n(1-\epsilon)}$ hold? If so, accept $T_j$, else reject $T_j$

Use accepted $T_j$'s to decode [Jaggi05], [JLHE05], [JKLHDKM07]


Network error correcting codes for adversarial
errors in multiple source networks

Overview: 

•  Network  error  correcting  codes  allow  reliable  transmission  over  a  network  that  is 
subject  to  adversarial  errors  [KK07,  08] 

•  Existing  work  gives  bounds  and  explicit  code  constructions  for  single-source 
multicast  networks 

•  We  generalize  these  results  to  multiple  source  multicast  networks 

Model  and  definitions: 

•  We consider a directed graph $G$ with a set $U$ of source nodes, and an adversary
that can introduce arbitrary errors on up to $z$ links

•  The region of reliable multicast transmission rates $k_i$ from the $i$-th source to the sinks
is given in terms of the minimum cut capacities $m_S$ between sources in subsets $S$
of $U$ and each sink

Approach  and  results: 

•  The  reliable  communication  rates  under  z  adversarial  errors  satisfy 

—  Singleton bound: $\sum_{i \in S} k_i \le m_S - 2z, \ \forall S \subseteq U$

—  Hamming bound: $\sum_{i \in S} k_i \le m_S - \log_q \sum_{j=0}^{z} \binom{m_S}{j} (q-1)^j, \ \forall S \subseteq U$
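
As a worked example, the two bounds can be evaluated for one subset S. The sketch assumes the Singleton-type bound is m_S - 2z and that the Hamming-type bound has the usual sphere-packing form m_S - log_q of the sum over j = 0..z of C(m_S, j)(q - 1)^j; the function names are illustrative.

```python
from math import comb, log


def singleton_bound(m_s: int, z: int) -> int:
    """Upper bound m_S - 2z on the sum rate of the sources in S."""
    return m_s - 2 * z


def hamming_bound(m_s: int, z: int, q: int) -> float:
    """Upper bound m_S - log_q(sum_{j=0}^{z} C(m_S, j) * (q - 1)**j) on the same sum rate."""
    volume = sum(comb(m_s, j) * (q - 1) ** j for j in range(z + 1))
    return m_s - log(volume, q)


# Min cut of 6 and one adversarial link: the Singleton bound gives 4 for any field size.
assert singleton_bound(6, 1) == 4
# Over the binary field the Hamming bound is the tighter of the two in this example.
assert hamming_bound(6, 1, 2) < singleton_bound(6, 1)
# Over a large field the Singleton bound is the tighter one.
assert hamming_bound(6, 1, 256) > singleton_bound(6, 1)
```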


Network  error  correcting  codes  -  keeping 
enough  degrees  of  freedom  around 

Overview: 

•  Determining  the  level  of  diversity  against  pollution  is 
crucial  in  ensuring  the  operation  of  coding  against 
attackers  [LBK08] 

Model, definitions and approach:

•  In order to be able to model accurately the topology of a
peer-to-peer distributed network with network coding we
introduce the following:

—  Evolving overlay network: scale-free random graph

—  Three types of nodes: data source, data collector,
data keeper

—  A tracker keeps a record of all nodes that store
packets

—  Each keeper connects to a positive number of
nodes in order to create diversity in the linear
independence of packets in the network

[Figure: information contact graph evolving through time]


Verification  for  content  distribution 

without  decoding 

What we may want is a means of detecting that a single packet is polluted without decoding

Need  a  homomorphic  signature  scheme  that  allows  nodes  to  verify  any  linear 
combination  of  pieces  without  contacting  the  original  sender  or  decoding  packets 

Can  use  homomorphic  hash  functions  [ADMK05],  [GR06] 

Can  use  Secure  Random  Checksum  (SRC)  which  requires  less  computation  than  the 
homomorphic  hash  function,  but  requires  a  secure  channel  to  all  the  nodes  [KFM06] 

A  signature  scheme  without  a  secure  channel  for  transmitting  hash  values  and 
associated  digital  signatures  of  received  and  transmitted  blocks 

—  Weil pairing on elliptic curves provides authentication of the data in addition to
pollution detection [CJL06]

—  Use  a  scheme  that  relies  on  the  network  coding  scheme  intrinsically  [ZKMH07] 

We  take  a  novel  approach  that  uses  a  more  algebraic  angle  [ZKMH07],  [HHKMZ07] 

It can be shown that it is as hard as the (p, m, m+n) Diffie-Hellman problem

Overheads

—  Part  of  the  public  key  has  to  be  re-generated  for  each  file 

—  Signature  vector 

If the file sizes are large, after the initial setup, each additional file distributed only incurs
a negligible amount of overhead using our signature scheme

Our  signature  scheme  has  to  be  applied  on  the  original  file,  not  on  hashes 


Conclusions 


This  program  has  provided  us  the  ability  to  develop  means  of: 

—  Establishing  new  means  of  data  protection  using  network 
coding 

—  Constructing  families  of  codes  that  are  near-optimal 
theoretically  to  recover  from  Byzantine  attacks  without 
locating  them 

—  Creating  a  means  for  verifying  validity  of  data  without 
decoding  or  using  a  trusted  authority 

—  Creating  a  means  of  tracking  reliability  of  network  under 
network  coding 


Networks with random erasures and adversarial errors

Overview:

•  Network  codes  offer  useful  error  and  erasure  correction 
capabilities,  but  can  also  suffer  from  error  propagation 

•  The  extent  and  manner  in  which  network  coding  should  be  applied 
is  shown  to  depend  on  the  network  topology  and  the  probability 
distribution  of  erasures  and  errors 

Model: 


Packet  transmissions  in  the  network  are  randomly  subject  to 
erasures  and  adversarial  or  arbitrary  errors,  with  probabilities  p  and 
q  respectively 


•  We  compare  different  coding  and  routing  strategies  for  transmitting 
over  multiple  source-sink  paths 

Approach  and  results: 

•  We show that when the source performs random coding, the
problem can be reduced to optimization of the strategy used on
each path

•  We  define  a  quantity  called  information  rank  loss  which  can  be 
used  as  a  proxy  for  probability  of  successful  decoding  in  the 
optimization  problem  (minimize  information  rank  loss) 

•  We find that random coding becomes more beneficial relative to
routing as the redundancy (minimum cut capacity C minus source
information rate R) increases, and as q decreases relative to p (as
shown in the graphs for 2 equal paths with p=0.1, C=20)

•  We  can  also  optimize  the  trade-off  between  coding  across  paths 
and  coding  among  packets  transmitted  on  a  path 


Network Coding for Network Security


Objective 

Use network coding to enable greater
robustness and security

•  Use  network  coding  for  detection  and  correction 
of  Byzantine  attackers 

•  Provide  network-coding  based  verification 
systems  without  the  need  for  a  trusted  authority 

•  Use  network  coding  for  providing  free  cyphers 
in  the  network 

Verification  can  occur  without  trusted  authority 

Scientific/Technical  Approach 

•  Use  the  algebraic  linear  mixing  of  data  to  detect 
Byzantine  attackers  through  the  use  of  polynomial 
hashes 

•  Extend  MDS-style  codes  in  conjunction  with 
hash-based  majority  voting  scheme 

•  Generalize  coding  bounds  by  use  of  q-Johnson 
scheme,  akin  to  the  Grassmannian  manifold  approach 
in  continuous  cases 

•  Use  discrete-log  approach  for  verification  in  a  manner 
that  is  robust  to  linear  operations 

•  Use  data  mixture  as  one-time  pad  in  the  network 

Accomplishments 

•  New  algorithms  for  detection  and  correction 
of  attacks  in  the  context  of  users  with  shared  secrets, 
omniscient  adversaries  and  limited  adversaries 

•  New theoretical basis for the study of errors and
erasures based on q-Johnson scheme

•  A new algorithm for secure network-coding
based peer-to-peer file exchanges based on new
signature schemes, developed with HP Laboratories

Challenges 

•  Applying our verification approach on hash functions
of the data.

Impact  and  outreach 


The  support  of  AFOSR  is  important  since  this  topic  is  directly  rooted 
in  adversarial  settings.  As  such,  namely  as  an  investigation  of 
information  dissemination  robustness  using  network  coding  in  a 
hostile  setting,  this  topic  is  identified  as  natural  research  in 
military  contexts 

■  We  believe  that  demonstrating  the  robustness  of  network 
coding  when  it  is  combined  with  error  correction  in  a  natural  way, 
will  be  an  enabling  factor  for  the  application  of  network  coding  in 
highly  volatile,  hostile  scenarios.  Moreover,  the  techniques  are 
applicable  also  in  non-adversary  network  contexts,  which  promises 
to  have  a  high  impact  factor. 

■  Collaboration with Dina Katabi, Sachin Katti (MIT CSAIL), Sid Jaggi
(Chinese University of Hong Kong), Michael Langberg (The Open
University of Israel), Michelle Effros (Caltech), Ton Kalker (HP Labs)

■  Commercial impact and synergistic collaboration with DARPA
ITMANET and CBMANET projects for transitioning ideas to
MANETs


Byzantine correction


Block-length $n$ over finite field $\mathbb{F}_q$

$D_j = T_j(1) \cdot 1 + T_j(2) \cdot r + \dots + T_j(n(1-\epsilon)) \cdot r^{n(1-\epsilon)}$

Does $D_j = T_j(1)' \cdot 1 + T_j(2)' \cdot r + \dots + T_j(n(1-\epsilon))' \cdot r^{n(1-\epsilon)}$ hold? If so, accept $T_j$, else reject $T_j$

Use accepted $T_j$'s to decode [Jaggi05], [JLHE05], [JKLHDKM07]


Network  coding  and  network  error 
correction 


Random  network  coding  is  susceptible  to  modifications  of  packets  (adversary, 
jamming,  non-hostile,  packet  erasures) 

Error  correction  in  combination  with  network  coding  was  considered  by  Yeung  et  al., 
and  Zhang  -  here  the  network  topology  plays  a  central  role 

Work  in  the  context  of  Byzantine  modifications  in  arbitrary  networks:  Ho  et  al.  and 
Jaggi  et  al. 

We consider a network as modeled by a random linear operator reflecting the
operation of random network coding on a network of unknown topology


Operator Channel: Input is a subspace $V$ of ambient $n$-dimensional space $W$,

$\mathcal{H}_k$ is a random linear operator mapping $V$ to a $k$-dimensional
subspace of $V$; $E$ is an error space of dimension $t(E)$

Output is a subspace $U$ of $W$

$U = \mathcal{H}_k(V) \oplus E$


This formulation is very similar to non-coherent detection in the MIMO case: Zheng and Tse


Network  coding  and  network  error 
correction 

Just  as  in  the  MIMO  case:  Constructing  codes  is  equivalent  to 
packing  subspaces  of  dimension  An  in  ambient  space  of  dimension  n. 

The  metric  for  defining  distance  between  two  spaces  A,B  is 

$d(A, B) := \dim(A + B) - \dim(A \cap B)$

Equivalent to finding codes in the Grassmannian graph (q-Johnson scheme)

[Figure: the network as a channel. Input: arbitrary basis vectors for a chosen space $U$ of dimension $L$. Injection of an error space $E$ of dimension $t$. Output: basis vectors for a space $Y$.]

Different modes of operation:

$\rho = \dim(U) - \dim(U \cap Y)$ is the number of “erasures”

$t = \dim(Y) - \dim(U \cap Y)$ is the number of “errors”

We can correct any number of errors and erasures as long as $2(\rho + t) < D$
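
The subspace distance and the erasure/error counts can be computed from ranks alone. The sketch below works over GF(2) with vectors stored as integer bitmasks (a representation chosen here for brevity) and assumes the reading in which erasures are dim(U) - dim(U ∩ Y) and errors are dim(Y) - dim(U ∩ Y) for input space U and output space Y; the function names are illustrative.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit in v when it is set
        if v:
            basis.append(v)
    return len(basis)


def dim_intersection(A, B):
    """dim(A ∩ B) = dim A + dim B - dim(A + B) for spans of the given vectors."""
    return rank_gf2(A) + rank_gf2(B) - rank_gf2(list(A) + list(B))


def subspace_distance(A, B):
    """d(A, B) = dim(A + B) - dim(A ∩ B)."""
    return rank_gf2(list(A) + list(B)) - dim_intersection(A, B)


def erasures_and_errors(U, Y):
    """Dimensions lost from the input space and added to the output space."""
    common = dim_intersection(U, Y)
    return rank_gf2(U) - common, rank_gf2(Y) - common


U = [0b1000, 0b0100, 0b0010]  # transmitted space, dimension 3
Y = [0b1000, 0b0100, 0b0001]  # received space: one dimension lost, one injected

assert subspace_distance(U, Y) == 2
assert erasures_and_errors(U, Y) == (1, 1)
```

With one erasure and one error, the condition 2(1 + 1) < D requires a code of minimum distance D of at least 5.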


Content  distribution  of  large  files 


■  Distribution  of  large  files  to  many  users. 

■  Traditional  solutions  are  based  on  a  client-server 
model. 

■  Alternative technique - P2P swarming.

■  Example  -  BitTorrent 

□  Divide  file  into  many  pieces. 

□  Client  requests  different  pieces  from  server(s)  or  other 
users. 

□  Client  becomes  server  to  pieces  downloaded. 

□  When  a  client  has  obtained  all  pieces,  re-construct  the 
whole  file. 

□  Problem:  hard  to  do  optimal  scheduling  of  pieces  to  nodes. 


Content distribution using network coding


■  Use network coding to increase the efficiency of
content distribution in a P2P cooperative architecture
[ADMK05],  [DPR05],  [GR05],  [DGWR07] 

■  Instead  of  storing  pieces  on  servers,  store  random 
linear  combination  of  the  pieces  on  servers 

■  Clients  also  generate  random  linear  combination  of 
the  pieces  they  have  received  to  send  out 

■  When  a  client  has  accumulated  enough  degrees  of 
freedom,  decode  to  obtain  the  whole  file 
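
The store, recombine and decode steps above can be sketched as follows. This is a toy illustration over a prime field of size 257 with hypothetical helper names; it is not the implementation used in the cited systems. Each packet carries its coefficient vector as a header, so peers can recombine packets without decoding them.

```python
import random

FIELD = 257  # prime field size, illustrative choice


def augment(pieces):
    """Prefix piece i with the unit vector e_i so coefficients travel with the data."""
    m = len(pieces)
    return [[int(i == j) for j in range(m)] + list(piece) for i, piece in enumerate(pieces)]


def recombine(packets, rng, p=FIELD):
    """Random linear combination of packets (headers are mixed with the payloads)."""
    coeffs = [rng.randrange(p) for _ in packets]
    return [sum(c * pkt[k] for c, pkt in zip(coeffs, packets)) % p
            for k in range(len(packets[0]))]


def decode(packets, m, p=FIELD):
    """Gauss-Jordan elimination mod p; returns the m pieces, or None if rank < m."""
    rows = [list(pkt) for pkt in packets]
    for col in range(m):
        pivot = next((i for i in range(col, len(rows)) if rows[i][col]), None)
        if pivot is None:
            return None  # not enough degrees of freedom yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, p)  # modular inverse, Python 3.8+
        rows[col] = [(x * inv) % p for x in rows[col]]
        for i in range(len(rows)):
            if i != col and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[col])]
    return [row[m:] for row in rows[:m]]


rng = random.Random(1)
pieces = [[rng.randrange(FIELD) for _ in range(8)] for _ in range(4)]
server_store = [recombine(augment(pieces), rng) for _ in range(6)]  # server keeps coded pieces only

received, decoded = [], None
for _ in range(50):  # bounded loop; a handful of downloads suffices with high probability
    received.append(recombine(server_store, rng))  # every download is a fresh combination
    decoded = decode(received, m=len(pieces))
    if decoded is not None:
        break

assert decoded == pieces
```

Because every download is useful with high probability, the client does not need to schedule which piece to request from which node.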


Content distribution using network
coding

A malicious user can send packets with a valid linear combination in the header,
but garbage in the payload

The  pollution  of  packets  spreads  quickly 

Need  a  homomorphic  signature  scheme  that  allows  nodes  to  verify  any  linear 
combination  of  pieces  without  contacting  the  original  sender  or  decoding  packets 


Verification  for  content  distribution 


■  Can  use  homomorphic  hash  functions  in  content  distribution 
systems  to  detect  polluted  packets  [ADMK05],  [GR06] 

■  Can  use  Secure  Random  Checksum  (SRC)  which  requires  less 
computation  than  the  homomorphic  hash  function,  but  requires  a 
secure  channel  to  transmit  the  SRCs  to  all  the  nodes  in  the 
network  [KFM06] 

■  A signature scheme without a secure channel for transmitting
hash values and associated digital signatures of received and
transmitted blocks:


□  Weil pairing on elliptic curves provides authentication of the data
in addition to pollution detection [CJL06]


□  Use  a  scheme  that  relies  on  the  network  coding  scheme 
intrinsically  [ZKMH07] 


Problem  formulation 


■  A  source  s  wishes  to  send  a  large  file  to  a  group  of 
peers,  T 

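■  View the data to be transmitted as vectors $\bar{v}_1, \dots, \bar{v}_m$ in the
$n$-dimensional vector space $\mathbb{F}_p^n$, where $p$ is a prime. The
source node augments these vectors to $v_1, \dots, v_m$ given by

$v_i = (0, \dots, 0, 1, 0, \dots, 0, \bar{v}_{i1}, \dots, \bar{v}_{in})$

where the first $m$ elements are zero except the $i$-th one is
1, and $\bar{v}_{ij} \in \mathbb{F}_p$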

•  Each packet received by a peer is a linear combination
$w = \sum_{i=1}^{m} \beta_i v_i$
of all the pieces.


Signature  for  network  coding 


■  The vectors $v_1, \dots, v_m$ span a subspace $V$ of $\mathbb{F}_p^{m+n}$.

■  A  received  packet  is  a  valid  linear  combination  if  and 
only  if  it  belongs  to  V. 

•  Each  node  verifies  the  integrity  of  a  received  vector 
w  by  checking  the  membership  of  w  in  V. 


Our  approach  has  the  following  ingredients: 

□  $q$: a large prime such that $p$ is a divisor of $q - 1$.

□  $g$: a generator of the group $G$ of order $p$ in $\mathbb{F}_q$.

□  Private key: $K_{pr} = \{\alpha_i\}_{i=1,\dots,m+n}$, a random set of elements
in $\mathbb{F}_p^*$

□  Public key: $K_{pu} = \{h_i = g^{\alpha_i}\}_{i=1,\dots,m+n}$


Signature for network coding


The scheme works as follows:

The source finds a vector $u$ that is orthogonal to all
vectors in $V$.

The source computes vector $x = (u_1/\alpha_1, \dots, u_{m+n}/\alpha_{m+n})$.

The source signs $x$ with some standard signature
scheme and publishes it.

When a node receives a vector $w$ and wants to verify
that $w$ is in $V$, it computes

$d = \prod_{i=1}^{m+n} h_i^{x_i w_i}$

and verifies that $d = 1$.
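
A toy run of this verification is sketched below under stated assumptions: the check is taken to be d = prod_i h_i^(x_i w_i) mod q with h_i = g^(alpha_i) and x_i = u_i / alpha_i in F_p, the parameters p = 11, q = 23, g = 2 are far too small for real use, and signing x with a standard signature scheme is omitted. All variable names are illustrative.

```python
import random

p, q, g = 11, 23, 2  # toy parameters: p divides q - 1 and g has order p modulo q
assert (q - 1) % p == 0 and pow(g, p, q) == 1 and g != 1

rng = random.Random(7)
m, n = 2, 3  # m pieces, each a vector of n symbols in F_p

pieces = [[rng.randrange(p) for _ in range(n)] for _ in range(m)]
# Augmented vectors v_i = (unit vector e_i | piece i); they span V.
V = [[int(i == j) for j in range(m)] + pieces[i] for i in range(m)]

# Key pair: alpha_i random nonzero elements of F_p, h_i = g^alpha_i mod q.
alpha = [rng.randrange(1, p) for _ in range(m + n)]
h = [pow(g, a, q) for a in alpha]

# u orthogonal to V: choose the tail c freely, then set head_i = -<piece_i, c> mod p.
c = [rng.randrange(1, p) for _ in range(n)]
u = [(-sum(a * b for a, b in zip(piece, c))) % p for piece in pieces] + c
x = [(ui * pow(ai, -1, p)) % p for ui, ai in zip(u, alpha)]  # x_i = u_i / alpha_i in F_p


def verify(w):
    """Accept w iff prod_i h_i^(x_i * w_i) equals 1 modulo q."""
    d = 1
    for hi, xi, wi in zip(h, x, w):
        d = (d * pow(hi, (xi * wi) % p, q)) % q
    return d == 1


coeffs = [rng.randrange(p) for _ in range(m)]
w = [sum(b * v[k] for b, v in zip(coeffs, V)) % p for k in range(m + n)]
polluted = w[:-1] + [(w[-1] + 1) % p]  # garbage in the payload, header untouched

assert verify(w)
assert not verify(polluted)
```

The product equals g raised to the inner product of u and w, so it is 1 exactly when w is orthogonal to u modulo p; the polluted packet fails here because the altered coordinate has nonzero weight in u.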


Discussion 


It can be shown that it is as hard as the $(p, m, m+n)$ Diffie-Hellman
problem

■  Thus, it is as hard as the Discrete Logarithm problem to find new
vectors that also satisfy the verification criterion other than those
that are in $V$ [BF99]

■  Overheads 

□  Part  of  the  public  key  K  has  to  be  re-generated  for  each  file, 
otherwise  a  malicious  node  can  use  the  information  from  the 
previous  file 

□  Signature vector

■  If  the  file  sizes  are  large,  after  the  initial  setup,  each  additional  file 
distributed  only  incurs  a  negligible  amount  of  overhead  using  our 
signature  scheme 

■  Our  signature  scheme  has  to  be  applied  on  the  original  file,  not 
on  hashes. 


Looking  forward 


■  The  free  cypher  of  network  coding  can  lead  to  new  ways 
of  managing  security  in  the  networks: 

□  Using  partial  knowledge  of  a  file  as  a  cypher 

□  Rate-based  security  in  networks? 

■  Network  coding  for  pollution  detection  and  correction: 

□  How  do  we  locate  attackers? 

□  What  is  the  effect  of  incorrect  network  management? 

□  Connection  with  network  tomography  using  network  coding 
[FM05,  06] 

■  Verification  in  network  coding: 

□  Can  we  find  further  network  coding  specific  schemes? 

□  Can  we  use  schemes  on  hashes? 

□  What are the interactions with free cyphers in networks?


Abstract — Network coding substantially increases network
throughput.  But  since  it  involves  mixing  of  information  inside 
the  network,  a  single  corrupted  packet  generated  by  a  malicious 
node  can  end  up  contaminating  all  the  information  reaching  a 
destination,  preventing  decoding. 

This paper introduces distributed polynomial-time rate-optimal
network codes that work in the presence of Byzantine nodes. We
present algorithms that target adversaries with different attacking
capabilities. When the adversary can eavesdrop on all links and
jam $z_O$ links, our first algorithm achieves a rate of $C - 2z_O$, where
$C$ is the network capacity. In contrast, when the adversary has limited
eavesdropping capabilities, we provide algorithms that achieve
the higher rate of $C - z_O$.

Our algorithms attain the optimal rate given the strength of
the adversary. They are information-theoretically secure. They
operate in a distributed manner, assume no knowledge of the
topology, and can be designed and implemented in polynomial
time. Furthermore, only the source and destination need to be
modified; nonmalicious nodes inside the network are oblivious to
the presence of adversaries and implement a classical distributed
network code. Finally, our algorithms work over wired and wireless
networks.

Index  Terms — Byzantine  adversaries,  distributed  network 
error-correcting  codes,  eavesdroppers,  information-theoretically 
optimal,  list  decoding,  polynomial-time  algorithms. 


I.  Introduction 

NETWORK coding allows the routers to mix the information
content in packets before forwarding them. This
mixing has been theoretically proven to maximize network
throughput [1], [23], [21], [15]. It can be done in a distributed
manner with low complexity, and is robust to packet losses and
network failures [10], [25]. Furthermore, recent implementations
of network coding for wired and wireless environments
demonstrate its practical benefits [18], [8].

But what if the network contains malicious nodes? A malicious
node may pretend to forward packets from source to
destination, while in reality it injects corrupted packets into
the information flow. Since network coding makes the routers
mix packets’ content, a single corrupted packet can end up
corrupting all the information reaching a destination. Unless
this problem is solved, network coding may perform much
worse than pure forwarding in the presence of adversaries.

The interplay of network coding and Byzantine adversaries
has been examined by a few recent papers. Some detect the presence
of an adversary [12], others correct the errors he injects into
the codes under specific conditions [9], [14], [22], [31], and a
few bound the maximum achievable rate in such adverse environments
[3], [29]. But attaining optimal rates using distributed
and low-complexity codes was an open problem.

This paper designs distributed polynomial-time rate-optimal
network codes that combat Byzantine adversaries.¹ We present
three algorithms that target adversaries with different strengths.
The adversary can inject $z_O$ packets per unit time, but his listening
power varies. When the adversary is omniscient, i.e., he
observes transmissions on the entire network, our codes achieve
the rate of $C - 2z_O$, with high probability. When the adversary’s
knowledge is limited, either because he eavesdrops only on a
subset of the links or the source and destination have a low-rate
secret channel, our algorithms deliver the higher rate of $C - z_O$.

The intuition underlying all of our algorithms is that the aggregate
packets from the adversarial nodes can be thought of as
a second source. The information received at the destination is a
linear transform of the source’s and the adversary’s information.
Given enough linear combinations (enough coded packets), the
destination can decode both sources. The question, however, is
how the destination distills out the source’s information
from the received mixture. To do so, the source’s information
has to satisfy certain constraints that the attacker’s data cannot
satisfy. This can be done by judiciously adding redundancy at
the source. For example, the source may add parity checks on
the source’s original data. The receiver can use the syndrome of
the received packets to determine the effect of the adversary’s
transmissions. The challenge addressed herein is to design the
parity checks for distributed network codes that achieve the optimal
rates.

¹Independently and concurrently to our work, Koetter and Kschischang [19] present results of a similar nature, which are discussed in detail in Section II.

Conceptually, our proof involves two steps. We first analyze
standard  network  coding  in  the  presence  of  Byzantine  adver¬ 
saries  (without  adding  additional  redundancy  at  the  source).  In 
this setting, as expected, destination nodes cannot uniquely decode the source's data; however, we show that they can list decode this data. Namely, receivers can identify a short list of potential messages that may have been transmitted. Once this is
established,  we  analyze  the  effect  of  redundancy  at  the  source 
in  each  one  of  our  scenarios  (omniscient  or  limited  adversaries). 

This  paper  makes  several  contributions.  The  algorithms  pre¬ 
sented  herein  are  distributed  algorithms  with  polynomial-time 
complexity  in  design  and  implementation,  yet  are  rate-optimal. 
In  fact,  since  pure  forwarding  is  a  special  case  of  network 
coding,  being  rate-optimal,  our  algorithms  also  achieve  a 
higher  rate  than  any  approach  that  does  not  use  network  coding. 
They  assume  no  knowledge  of  the  topology  and  work  in  both 
wired  and  wireless  networks.  Furthermore,  implementing  our 
algorithms  involves  only  a  slight  modification  of  the  source  and 
receiver  while  the  internal  nodes  can  continue  to  use  standard 
network  coding. 

II.  Related  Work 

Work  on  network  coding  started  with  a  pioneering  paper  by 
Ahlswede et al. [1], which establishes the value of coding in
the  routers  and  provides  theoretical  bounds  on  the  capacity  of 
such  networks.  The  combination  of  [23],  [21],  and  [15]  shows 
that,  for  multicast  traffic,  linear  codes  achieve  the  maximum 
capacity  bounds,  and  both  design  and  implementation  can  be 
done  in  polynomial  time.  Additionally,  Ho  et  al.  show  that  the 
above  is  true  even  when  the  routers  perform  random  linear  op¬ 
erations  [10].  Researchers  have  extended  the  above  results  to  a 
variety  of  areas  including  wireless  networks  [25],  [17],  [18],  en¬ 
ergy  [28],  secrecy  [2],  content  distribution  [8],  and  distributed 
storage  [16].  For  a  couple  of  nice  surveys  on  network  coding 
see,  e.g.,  [30],  [7]. 

A  Byzantine  attacker  is  a  malicious  adversary  hidden  in  a  net¬ 
work,  capable  of  eavesdropping  and  jamming  communications. 
Prior  research  has  examined  such  attacks  in  the  presence  of  net¬ 
work  coding  and  without  it.  In  the  absence  of  network  coding, 
Dolev et al. [5] consider the problem of communicating over a known graph containing Byzantine adversaries. They show that for k adversarial nodes, reliable communication is possible only if the graph has more than 2k + 1 vertex connectivity. Subramaniam extends this result to unknown graphs [27]. Pelc et al. address the same problem in wireless networks by modeling malicious nodes as locally bounded Byzantine faults, i.e., nodes can overhear and jam packets only in their neighborhood [26].

The  interplay  of  network  coding  and  Byzantine  adversaries 
was  examined  in  [12],  which  detects  the  existence  of  an  adver¬ 
sary  but  does  not  provide  an  error-correction  scheme.  The  work 
of  Cai  and  Yeung  [2],  [29],  [3]  generalizes  standard  bounds  on 
error-correcting  codes  to  networks,  without  providing  any  ex¬ 
plicit  algorithms  for  achieving  these  bounds.  Our  work  presents 
a  constructive  design  to  achieve  those  bounds. 

The  problem  of  efficiently  correcting  errors  in  the  presence  of 
both network coding and Byzantine adversaries has been considered by a few prior proposals. Earlier work [22], [9] assumes a centralized trusted authority that provides hashes of the original packets to each node in the network. Charles et al. [4] obviate the need for a trusted entity under the assumption that the majority of packets received by each node is uncorrupted.
Recently,  Zhao  et  al.  [32]  have  demonstrated  error  detection  in 
the  public  key  cryptographic  setting.  In  contrast  to  the  above 
schemes  which  are  cryptographically  secure,  in  a  previous  work 
[14],  we  consider  an  information-theoretically  rate-optimal  so¬ 
lution  to  Byzantine  attacks  for  wired  networks,  which  however 
requires  a  centralized  design.  This  paper  builds  on  the  above 
prior  schemes  to  combine  their  desirable  traits;  it  provides  a  dis¬ 
tributed  solution  that  is  information-theoretically  rate  optimal 
and  can  be  designed  and  implemented  in  polynomial  time.  Fur¬ 
thermore,  our  algorithms  have  new  features;  they  assume  no 
knowledge  of  the  topology,  do  not  require  any  new  function¬ 
ality  at  internal  nodes,  and  work  for  both  wired  and  wireless 
networks. 

The  work  closest  in  spirit  to  our  work  is  that  of  Koetter  and 
Kschischang  [19],  who  also  studied  the  presence  of  Byzantine 
adversaries  in  the  distributed  network  coding  setting.  They 
concentrate  on  communicating  against  an  omniscient  adver¬ 
sary,  and  present  a  distributed  scheme  of  optimal  rate  C  -  2 zo. 
The  proof  techniques  of  [19]  differ  substantially  from  those 
presented  in  this  work.  In  a  nutshell,  Koetter  and  Kschischang 
reduce  the  model  of  network  coding  to  a  certain  point-to-point 
channel.  They  then  construct  generalizations  of  Reed-Solomon 
codes  for  this  channel,  which  enables  the  authors  to  construct 
deterministic  network  error-correcting  codes  as  mentioned 
above. 

We  would  like  to  note  that  the  abstraction  used  in  [19]  (al¬ 
though  very  elegant)  comes  at  a  price.  It  does  not  encapsulate 
the  additional  Byzantine  scenarios  that  arise  naturally  in  prac¬ 
tice  and  are  addressed  in  our  current  paper  (i.e.,  adversaries  of 
limited  knowledge,  discussed  in  Sections  VI  and  VIII).  More 
specifically,  our  protocol  enables  us  to  attain  the  higher  rate  of 
C  -  zo ,  albeit  only  under  the  (weaker)  requirement  of  list  de¬ 
coding.  List  decoding  in  the  setting  of  network  communication 
is  a  central  ingredient  in  our  proofs  for  limited  adversaries.  To 
the best of our current knowledge, the abstraction of [19] (although based on Reed-Solomon-like codes) does not allow efficient list decoding.

III.  Model  and  Definitions 

We  use  a  general  model  that  encompasses  both  wired  and 
wireless  networks.  To  simplify  notation,  we  consider  only  the 
problem  of  communicating  from  a  single  source  to  a  single  des¬ 
tination.  But  similarly  to  most  network  coding  algorithms,  our 
techniques  generalize  to  multicast  traffic. 

A.  Threat  Model 

There  is  a  source,  Alice,  who  communicates  over  a  wired  or 
wireless  network  to  a  receiver  Bob.  There  is  also  an  attacker 
Calvin,  hidden  somewhere  in  the  network.  Calvin  aims  to  pre¬ 
vent  the  transfer  of  information  from  Alice  to  Bob,  or  at  least 
to  minimize  it.  He  can  observe  some  or  all  of  the  transmissions, 
and can inject his own. When he injects his own data, he pretends they are part of the information flow from Alice to Bob.

[Fig. 1. Alice's, Bob's, and Calvin's information matrices. Recoverable labels: n, packet size; b, batch size; δn, redundant symbols; number of packets Calvin injects; C, network capacity.]


Calvin  is  quite  strong.  He  is  computationally  unbounded.  He 
knows  the  encoding  and  decoding  schemes  of  Alice  and  Bob, 
and  the  network  code  implemented  by  the  interior  nodes.  He 
also  knows  the  exact  network  realization. 

B.  Network  and  Code  Model 

Network  Model:  The  network  is  modeled  as  a  hypergraph 
[24].  Each  transmission  carries  a  packet  of  data  over  a  hyper¬ 
edge  directed  from  the  transmitting  node  to  the  set  of  observer 
nodes.  The  hypergraph  model  captures  both  wired  and  wire¬ 
less  networks.  For  wired  networks,  the  hyperedge  is  a  simple 
point-to-point  link.  For  wireless  networks,  each  such  hyperedge 
is  determined  by  instantaneous  channel  realizations  (packets 
may  be  lost  due  to  fading  or  collisions)  and  connects  the  trans¬ 
mitter  to  all  nodes  that  hear  the  transmission.  The  hypergraph  is 
unknown  to  Alice  and  Bob  prior  to  transmission. 

Source:  Alice  generates  incompressible  data  that  she  wishes 
to  deliver  to  Bob  over  the  network.  To  do  so,  Alice  encodes  her 
data  as  dictated  by  the  encoding  algorithm  (described  in  subse¬ 
quent  sections).  She  divides  the  encoded  data  into  batches  of  b 
packets.  For  clarity,  we  focus  on  the  encoding  and  decoding  of 
one  batch. 

A packet contains a sequence of n symbols from the finite field $F_q$. All arithmetic operations henceforth are done over symbols from $F_q$. (See the treatment in [20].) Out of the n symbols in Alice's packet, $\delta n$ symbols are redundancy added by the source.

Alice organizes the data in each batch into a matrix $X$ as shown in Fig. 1. We denote the $(i,j)$th element in the matrix by $x(i,j)$. The $i$th row in the matrix $X$ is just the $i$th packet in the batch. Fig. 1 shows that, similarly to standard network codes [10], some of the redundancy in the batch is devoted to sending the identity matrix $I$. Also, as in [10], Alice takes random linear combinations of the rows of $X$ to generate her transmitted packets. As the packets traverse the network, the internal nodes apply a linear transform to the batch. The identity matrix receives the same linear transform. The destination discovers the linear relation, denoted by the matrix $T$, between the packets it receives and those transmitted. This is done by inspecting how $I$ was transformed.
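The role of the identity matrix can be illustrated with a minimal sketch. It assumes no adversary, a prime field standing in for $F_q$, and a single square matrix `T` standing in for the net effect of all random mixing; the names `matmul`, `inverse`, `random_invertible`, and `T_seen` are illustrative and do not come from the paper.

```python
import random

P = 2_147_483_647  # prime 2^31 - 1, standing in for the field size q (assumption)

def matmul(A, B):
    """Matrix product over GF(P)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % P for col in cols] for row in A]

def inverse(M):
    """Inverse of a square matrix over GF(P) by Gauss-Jordan elimination."""
    k = len(M)
    aug = [list(row) + [int(i == j) for j in range(k)] for i, row in enumerate(M)]
    for c in range(k):
        piv = next((r for r in range(c, k) if aug[r][c] % P), None)
        if piv is None:
            raise ValueError("matrix is singular over GF(P)")
        aug[c], aug[piv] = aug[piv], aug[c]
        scale = pow(aug[c][c], P - 2, P)  # inverse via Fermat's little theorem
        aug[c] = [x * scale % P for x in aug[c]]
        for r in range(k):
            if r != c and aug[r][c]:
                f = aug[r][c]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def random_invertible(k, rng):
    """Unit lower-triangular times unit upper-triangular, so always invertible."""
    low = [[1 if i == j else (rng.randrange(P) if j < i else 0) for j in range(k)]
           for i in range(k)]
    up = [[1 if i == j else (rng.randrange(P) if j > i else 0) for j in range(k)]
          for i in range(k)]
    return matmul(low, up)

rng = random.Random(0)
b, payload = 3, 5  # batch size and payload symbols per packet (toy values)
data = [[rng.randrange(P) for _ in range(payload)] for _ in range(b)]
# X = [data | I]: the identity occupies the last b columns of the batch.
X = [row + [int(i == j) for j in range(b)] for i, row in enumerate(data)]
T = random_invertible(b, rng)          # combined effect of the network's mixing
Y = matmul(T, X)                       # what the destination receives
T_seen = [row[payload:] for row in Y]  # transformed identity columns
recovered = matmul(inverse(T_seen), Y) # undo the transform
```

In this adversary-free setting the transformed identity columns equal `T` exactly, so inverting them recovers the whole batch.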

Adversary: Let the matrix $Z$ be the information Calvin injects into each batch. The size of this matrix is $z_O \times n$, where $z_O$ is the number of edges controlled by Calvin (alternatively, one may define $z_O$ to be the size of the min-cut from Calvin to the destination). In some of our adversarial models we limit the eavesdropping capabilities of Calvin. Namely, we limit the number of transmitted packets Calvin can observe. In such cases, this number will be denoted by $z_I$.

Receiver: Analogously to how Alice generates $X$, the receiver Bob organizes the received packets into a matrix $Y$. The $i$th received packet corresponds to the $i$th row of $Y$. Note that the number of received packets, and therefore the number of rows of $Y$, is a variable dependent on the network topology. Bob attempts to reconstruct Alice's information $X$ using the matrix of received packets $Y$.

As mentioned in the Introduction, conceptually, Bob recovers the information of Alice in two steps. First, Bob identifies a set of linear constraints which must be satisfied by the transmitted information $X$ of Alice. This set of constraints characterizes a linear subspace of low dimension in which $X$ must lie. We refer to this low-dimensional subspace as a linear list decoding of $X$. Once list decoding is accomplished, unique decoding follows by considering additional information Bob has on the matrix $X$ (such as its redundancy, or information transmitted by Alice over a low-rate secret channel).

Network  Transform:  The  network  performs  a  classical  dis¬ 
tributed  network  code  [10].  Specifically,  each  packet  transmitted 
by  an  internal  node  is  a  random  linear  combination  of  its  in¬ 
coming  packets.  Thus,  the  effect  of  the  network  at  the  destina¬ 
tion  can  be  summarized  as  follows: 


$$Y = TX + T'Z. \tag{1}$$

This can be written as

$$Y = [T \mid T'] \begin{bmatrix} X \\ Z \end{bmatrix} \tag{2}$$


where $X$ is the batch of packets sent by Alice, $Z$ refers to the packets Calvin injects into Alice's batch, and $Y$ is the received batch. The matrix $T$ refers to the linear transform from Alice to Bob, while $T'$ refers to the linear transform from Calvin to Bob. Notice that neither $T$ nor $T'$ is known to Bob. Rather, as shown in Fig. 1, Bob receives the matrix $\hat{T}$ (the last $b$ columns of $Y$, in which $T$ is mixed with Calvin's contribution), which cannot be directly used to recover $X$.
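The additive model can be checked numerically with a small sketch. It assumes a prime field in place of $F_q$, toy dimensions, and uniformly random transforms; the helper names and the variable `header` are illustrative only. It confirms that forms (1) and (2) agree, and that the identity columns Bob sees carry $T$ plus a term contributed by Calvin.

```python
import random

P = 2_147_483_647  # prime 2^31 - 1, standing in for the field size q (assumption)

def matmul(A, B):
    """Matrix product over GF(P)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % P for col in cols] for row in A]

def matadd(A, B):
    """Entrywise sum over GF(P)."""
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def rand_matrix(rows, cols, rng):
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(1)
b, z, n = 3, 1, 8   # batch size, injected packets, packet length (toy values)
m = b + z           # number of packets Bob receives (assumption)

# Alice's batch: payload followed by a b x b identity in the last b columns.
X = [[rng.randrange(P) for _ in range(n - b)] + [int(i == j) for j in range(b)]
     for i in range(b)]
Z = rand_matrix(z, n, rng)        # Calvin's injected packets
T = rand_matrix(m, b, rng)        # transform from Alice to Bob
T_prime = rand_matrix(m, z, rng)  # transform from Calvin to Bob

# Equation (1): Y = T X + T' Z.
Y = matadd(matmul(T, X), matmul(T_prime, Z))
# Equation (2): Y = [T | T'] [X; Z].
Y_stacked = matmul([t + tp for t, tp in zip(T, T_prime)], X + Z)

# The identity columns Bob observes are T plus Calvin's contribution.
header = [row[n - b:] for row in Y]
expected_header = matadd(T, matmul(T_prime, [row[n - b:] for row in Z]))
```

Because `header` generally differs from `T`, inverting it as in the adversary-free case does not recover Alice's batch, which is why additional structure at the source is needed.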

Notice  that  in  our  model  the  error  imposed  by  the  Byzantine 
adversary  Calvin  is  assumed  to  be  added  to  the  original  informa¬ 
tion  transmitted  on  the  network.  One  can  also  consider  a  model 
in which these errors overwrite the existing information transmitted by Alice. We stress that if Calvin is aware of transmissions on links, these two models are equivalent. Overwriting a message with $Z$ is equivalent to adding $-X_Z + Z$ on the links controlled by Calvin, where $X_Z$ represents the original transmissions on those links.

Definitions:  Table  I  lists  notation  needed  for  our  main  re¬ 
sults.  We  define  the  following  concepts. 

•  The network capacity, denoted by C, is the time average
of  the  maximum  number  of  packets  that  can  be  delivered 
from  Alice  to  Bob,  assuming  no  adversarial  interference, 
i.e.,  the  max  flow.  It  can  be  also  expressed  as  the  min-cut 
from  source  to  destination.  (For  the  corresponding  multi¬ 
cast  case,  C  is  defined  as  the  minimum  of  the  min-cuts  over 
all destinations.)

TABLE I
Terms Used in the Paper

Variable    Definition
C           Network capacity.
$z_O$       Number of packets Calvin can inject.
$z_I$       Number of packets Calvin can hear.
b           Number of packets in a batch.*
n           Length of each packet.
$\delta$    Alice's redundancy.

*Throughout this work, b is defined as $C - z_O$.


•  The  error  probability  is  the  probability  that  Bob’s  recon¬ 
struction  of  Alice’s  information  is  inaccurate. 

•  The rate R is the number of information symbols that can be delivered on average, per time step, from Alice to Bob. Rate R is said to be achievable if for any $\epsilon_1 > 0$ and $\epsilon_2 > 0$ there exists a coding scheme of block length n with rate $\ge R - \epsilon_2$ and error probability $\le \epsilon_1$.


IV. Summary of Results

We have three main results. Each result corresponds to a distributed, rate-optimal, polynomial-time algorithm that defeats an adversary of a particular type. The optimality of these rates has been proven by prior work [2], [3], [29], [14]. Our work, however, provides a construction of distributed codes/algorithms that achieve optimal rates. To prove our results, we first study the scenario of high-rate list decoding in the presence of Byzantine adversaries. In what follows, let $|\mathcal{T}|$ denote the number of receivers, and $|\mathcal{E}|$ denote the number of (hyper)-edges in the network.

A. Shared Secret Model

This model considers the transmission of information via network coding in a network where Calvin can observe all transmissions, and can inject $z_O$ corrupt packets. However, it is assumed that Alice can transmit to Bob a message (at asymptotically negligible rate) which is unknown to Calvin over a separate secret channel. In Section VI, we prove the following.

Theorem 1: The Shared Secret algorithm achieves an optimal rate of $C - z_O$ with code-complexity $O(nC^3)$.

B. Omniscient Adversary Model

This model assumes an omniscient adversary, i.e., one from whom nothing is hidden. As in the Shared Secret model, Calvin can observe all transmissions, and can inject $z_O$ corrupt packets. However, Alice and Bob have no shared secrets hidden from Calvin. In Section VII, we prove the following.

Theorem 2: The Omniscient Adversary algorithm achieves an optimal rate of $C - 2z_O$ with code-complexity $O((nC)^3)$.

C. Limited Adversary Model

In this model, Calvin is limited in his eavesdropping power; he can observe at most $z_I$ transmitted packets. Exploiting this weakness of the adversary results in an algorithm that, like the Omniscient Adversary algorithm, operates without a shared secret. In Section VIII, we prove the following.

Theorem 3: If $z_I < C - 2z_O$, the Limited Adversary algorithm achieves an optimal rate of $C - z_O$ with code-complexity $O(nC^3)$.

D. Linear List Decoding Model

A key building block in some of our proofs is a linear list decoding algorithm. The model assumes the Omniscient Adversary of Section IV-B. We design a code that Bob can use to output a linear list (of low dimension) that is guaranteed to contain Alice's message $X$. The list is then refined to obtain the results stated in Theorems 1-3. In Section V we prove the following.

Theorem 4: The Linear List Decoding algorithm achieves a rate of $C - z_O$ and outputs a list $L$ that is guaranteed to contain $X$. The list $L$ is a vector space of dimension $b(b + z_O)$. The code-complexity is $O(nC^3)$.


V. Linear List Decoding in the Omniscient Adversary Model

Here we assume we face an omniscient adversary, i.e., Calvin can observe everything, and there are no shared secrets between Alice and Bob. We design a code that Bob can use in this scenario to output a linear list (of low dimension) that is guaranteed to contain Alice's message $X$. Our algorithm achieves a rate of $R = C - z_O$. The corrupted information $Y$ Bob receives enables him to deduce a system of linear equations that $X$ satisfies. This system of equations ensures that $X$ lies in a low-dimensional vector space. We now present our algorithm in detail. Throughout this and upcoming sections, $b$ is fixed as $C - z_O$.

A. Alice's Encoder

Alice's encoder is quite straightforward. She simply arranges the source symbols into the $b \times n$ matrix $X$, appended with a $b$-dimensional identity matrix. She then implements the classical random network encoder described in Section III-B to generate her transmitted packets.

B. Bob's Decoder


Bob selects $b + z_O$ linearly independent columns of $Y$, and denotes the corresponding matrix $Y^s$. Here we assume, without loss of generality (w.l.o.g.), that the column rank of $Y$ is indeed $b + z_O$. The column rank cannot be larger than $b + z_O$ by (2). If the column rank happens to be $r < b + z_O$, Bob selects $r$ independent rows of $Y$ and continues in a procedure analogous to that described below. We also assume that $Y^s$ contains the last $b$ columns of $Y$ (corresponding to Alice's $b$-dimensional identity matrix). This is justified due to (2) and the assumption (discussed below) that the intersection of the column spans of $T$ and $T'$ is trivial, i.e., $[T \mid T']$ is regular (with high probability over the random choices of internal nodes in the network). The remaining $z_O$ columns of $Y^s$ are chosen arbitrarily so that $Y^s$ is invertible. The columns of $X$ and $Z$ corresponding to those in $Y^s$ are denoted $X^s$ and $Z^s$, respectively. By (2),

$$Y^s = [T \mid T'] \begin{bmatrix} X^s \\ Z^s \end{bmatrix}.$$

Also, since $Y^s$ acts as a basis for the columns of $Y$, we can write $Y = Y^s F$ for some matrix $F$. Bob can compute $F$ as $(Y^s)^{-1} Y$. Therefore, $Y$ can also be written as

$$Y = [T \mid T'] \begin{bmatrix} X^s F \\ Z^s F \end{bmatrix}. \tag{3}$$

Comparing (2) and (3), and again using the assumption that $[T \mid T']$ is invertible (with high probability) gives us

$$X = X^s F \tag{4}$$

$$Z = Z^s F. \tag{5}$$

In particular, (4) gives a linear relationship on $X$ that can be leveraged into a list-decoding scheme for Bob (the corresponding linear relationship from (5) is not very useful). The number of variables in $X^s$ is $b(b + z_O)$. Therefore, the entries of the matrix $X^s$ span a vector space of dimension $b(b + z_O)$ over $F_q$. Bob's list is the corresponding $b(b + z_O)$-dimensional vector space $L$ spanned by $X^s F$.

The only source of error in our argument arises if the intersection of the column spans of $T$ and $T'$ is nontrivial, i.e., if $[T \mid T']$ is singular. But as shown in [11], as long as $b + z_O \le C$, the probability of this event is at most $|\mathcal{T}||\mathcal{E}|q^{-1}$ for any fixed network. Since Calvin can choose his locations in at most $\binom{|\mathcal{E}|}{z_O}$ ways, the total probability of error is at most $\binom{|\mathcal{E}|}{z_O}|\mathcal{T}||\mathcal{E}|q^{-1}$. The computational cost of design, encoding, and decoding is dominated by the cost of computing $F$ and thereby a representation of $L$. This takes $O(nC^3)$ steps.
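The decoder's linear algebra can be sketched as follows. The sketch assumes a prime field in place of $F_q$, a square invertible matrix `M` playing the role of $[T \mid T']$, and toy dimensions; all function and variable names are illustrative. Bob only ever computes `F` from `Y`. The comparison against `X` and `Z` uses ground truth that Bob does not have and is there purely to check relations (4) and (5).

```python
import random
from itertools import combinations

P = 2_147_483_647  # prime 2^31 - 1, standing in for the field size q (assumption)

def matmul(A, B):
    """Matrix product over GF(P)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) % P for col in cols] for row in A]

def inverse(M):
    """Inverse of a square matrix over GF(P); raises ValueError if singular."""
    k = len(M)
    aug = [list(row) + [int(i == j) for j in range(k)] for i, row in enumerate(M)]
    for c in range(k):
        piv = next((r for r in range(c, k) if aug[r][c] % P), None)
        if piv is None:
            raise ValueError("matrix is singular over GF(P)")
        aug[c], aug[piv] = aug[piv], aug[c]
        scale = pow(aug[c][c], P - 2, P)
        aug[c] = [x * scale % P for x in aug[c]]
        for r in range(k):
            if r != c and aug[r][c]:
                f = aug[r][c]
                aug[r] = [(x - f * y) % P for x, y in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def random_invertible(k, rng):
    """Unit lower-triangular times unit upper-triangular, so always invertible."""
    low = [[1 if i == j else (rng.randrange(P) if j < i else 0) for j in range(k)]
           for i in range(k)]
    up = [[1 if i == j else (rng.randrange(P) if j > i else 0) for j in range(k)]
          for i in range(k)]
    return matmul(low, up)

def select(M, cols):
    """Keep only the listed columns of M, in the given order."""
    return [[row[c] for c in cols] for row in M]

rng = random.Random(2)
b, z, n = 3, 2, 9  # batch size, injected packets, packet length (toy values)
X = [[rng.randrange(P) for _ in range(n - b)] + [int(i == j) for j in range(b)]
     for i in range(b)]
Z = [[rng.randrange(P) for _ in range(n)] for _ in range(z)]
M = random_invertible(b + z, rng)  # plays the role of [T | T']
Y = matmul(M, X + Z)               # equation (2)

# Bob: pick z extra columns plus the last b (identity) columns so Y^s is invertible.
cols, Ys_inv = None, None
for extra in combinations(range(n - b), z):
    candidate = list(extra) + list(range(n - b, n))
    try:
        Ys_inv = inverse(select(Y, candidate))
    except ValueError:
        continue
    cols = candidate
    break
if cols is None:
    raise ValueError("no invertible column selection found")

F = matmul(Ys_inv, Y)  # F = (Y^s)^{-1} Y
Xs, Zs = select(X, cols), select(Z, cols)
```

Once `F` is known, every candidate $X^s$ of the right shape yields one member $X^s F$ of the list; the true `Xs` is simply the member that reproduces Alice's batch.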

Note: In the Linear List Decoding scheme described above, Alice appends an identity matrix to her source symbols to obtain the matrix $X$, causing an asymptotically negligible loss in rate. This is also the standard protocol of [10]. We note that our scheme works just as well even if Alice does not append such an identity matrix, and $X$ consists solely of source symbols. However, the appended identity matrix is used in the model of Section VII. We now solve (4) under different assumptions on Calvin's strength.


VI. Shared Secret Model

In the Shared Secret model, Alice and Bob have use of a strong resource, namely, a secret channel over which Alice can transmit a small amount of information to Bob that is secret from Calvin. The size of this secret is asymptotically negligible in $n$. Note that since the internal nodes mix corrupted and uncorrupted packets, Alice cannot just sign her packets and have Bob check the signature and throw away corrupted packets; in extreme cases, Bob may not receive any uncorrupted packets.

Alice uses the secret channel to send a random hash of her data to Bob. Bob first uses the list-decoding scheme of Section V to obtain a low-dimensional vector space $L$ containing $X$. He then uses Alice's hash to identify $X$ from $L$.

Let $\alpha$ be a parameter defined below. Let $r_1, \ldots, r_\alpha$ be $\alpha$ elements of $F_q$ chosen at random by Alice (and unknown to Calvin). Let $D = [d_{ij}]$ be an $n \times \alpha$ matrix in which $d_{ij} = (r_j)^i$. Let $XD = H$. Alice sends to Bob a secret $S$ comprising the symbols $r_1, \ldots, r_\alpha$ and the matrix $H$. The size of this secret is thus $\alpha(b + 1)$, which is asymptotically negligible in $n$.

Claim 5: For any $X' \ne X$, the probability (over $r_1, \ldots, r_\alpha$) that $X'D = H$ is at most $(n/q)^\alpha$.

Proof: We need to prove that $(X - X')D \ne 0$ with high probability, where $0$ is the zero matrix. As $X \ne X'$, there is at least one row in which $X$ differs from $X'$. Assume w.l.o.g. that this is the first row, and denote the first row of $X - X'$ by the nonzero vector $(x_1, \ldots, x_n)$. The $j$th entry in the first row of $(X - X')D$ is $F(r_j) = \sum_{i=1}^{n} x_i r_j^i$. As $F(r_j)$ is not the zero polynomial, the probability (over $r_j$) that $F(r_j) = 0$ is at most $n/q$. This holds for all entries of the first row of $(X - X')D$. Thus, the probability that the entire row is the zero vector is at most $(n/q)^\alpha$. □

Let $\alpha = b(b + z_O) + 1$. Let $L$ be a list (containing $X$) of distinct matrices. Let the size of $L$ be $q^{\alpha - 1}$.

Corollary 6: The probability (over $r_1, \ldots, r_\alpha$) that there exists $X' \in L$ such that $X' \ne X$ but $X'D = XD$ is at most $n^\alpha / q$.

Proof: We use Claim 5, and the union bound on all elements of $L$ that differ from $X$. □

Note: The secret channel is essential for the following reason. If the symbols $r_1, \ldots, r_\alpha$ were not secret from Calvin, he could carefully select his corrupted packets so that Bob's list $L$ would indeed contain an $X' \ne X$ such that $X'D = XD$.

Bob is able to decode the original information $X$ of Alice. Namely, Corollary 6 establishes that the system $XD = X^s F D = H$ has a single solution. This solution can be found using standard Gaussian elimination.

The above implies a scheme that achieves rate $C - z_O$. The optimality of this rate is shown in prior work [14]. The probability of error is at most $n^\alpha / q + \binom{|\mathcal{E}|}{z_O}|\mathcal{T}||\mathcal{E}| / q$. Here $\alpha = b(b + z_O) + 1$. The computational cost of design, encoding, and decoding is dominated by the cost of running the Linear List Decoding algorithm, which takes time $O(nC^3)$.

VII. Unique Decoding in the Omniscient Adversary Model


We now consider unique decoding. Our algorithm achieves a rate of $R = C - 2z_O$, which is lower than that possible in the list decoding scenario. Recent bounds [2], [3] on network error-correcting codes show that in fact $C - 2z_O$ is the maximum achievable rate for networks with an omniscient adversary.

To move from list decoding to unique decoding in the omniscient model, we add redundancy to Alice's information as follows. Alice writes her information $X$ in the form of a length-$bn$ column vector $\vec{X}$. The vector $\vec{X}$ is chosen to satisfy $D\vec{X} = 0$. Here, $D$ is a $\delta n \times bn$ matrix defined as the redundancy matrix. The matrix $D$ is obtained by choosing each element as an independent and uniformly random symbol from the finite field $F_q$, and $\delta n > n(z_O + \epsilon)$ for arbitrarily small $\epsilon$. This choice of parameters implies that the number of parity checks $D\vec{X} = 0$ is greater than the number of symbols in the $z_O$ packets that Calvin injects into the network. We show that this allows Bob to uniquely decode, implying a rate of $C - 2z_O$. The redundancy matrix $D$ is known to all parties (Alice, Bob, and Calvin) and hence does not constitute a shared secret.
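A rough symbol count shows where the rate $C - 2z_O$ comes from. This is a sketch under two assumptions not stated in the text at this point: $D$ has full row rank, and the $b \times b$ identity header is counted as overhead.

```latex
\begin{align*}
\text{symbols per batch} &= bn, \qquad b = C - z_O,\\
\text{parity constraints } D\vec{X} = 0 &: \ \delta n = (z_O + \epsilon)\,n,\\
\text{identity header} &: \ b^2,\\
R &\approx \frac{bn - \delta n - b^2}{n}
   = C - 2z_O - \epsilon - \frac{b^2}{n}
   \;\longrightarrow\; C - 2z_O
   \quad (n \to \infty,\ \epsilon \to 0).
\end{align*}
```

The redundancy costs roughly $z_O$ packets per batch on top of the $z_O$ already given up by fixing $b = C - z_O$, which accounts for the factor of two.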

Alice encodes as in Section V. Bob's decoding is as follows. Bob first runs the Linear List Decoding algorithm to obtain (4) and (5). We denote the matrix comprising the first $z_O$ rows of $F$ by $F_1$, and the matrix comprising the last $b$ rows of $F$ by $F_2$. By the constraints specified in Section V, the last $b$ columns of $X^s$ form an identity matrix. Thus, (4) transforms into

$$X = X_1^s F_1 + F_2 \tag{6}$$

where $X_1^s$ comprises the first $z_O$ columns of $X^s$.

Recall that $\vec{X}$ is a vector corresponding to the matrix $X$. Upon receiving $Y$, Bob computes $F$ and solves the system

$$X = X_1^s F_1 + F_2 \tag{7}$$

$$D\vec{X} = 0. \tag{8}$$

Here, only $D$ and $F$ are known to Bob. Our goal is now to show that with high probability over the entries of the matrix $D$, no matter which matrix $F$ was obtained by Bob, there is a unique solution to (7) and (8). The matrix $F$ depends on the errors $Z$ Calvin injects. Calvin can choose these to depend on $D$. We take this into consideration below.

The system of linear equations (7)-(8) can be written in matrix form as

$$A\vec{X} = \begin{bmatrix} A(F_1) \\ D \end{bmatrix} \vec{X} = B$$

where $A$ comprises the submatrices $A(F_1)$ and $D$, $A(F_1)$ is a $bn \times bn$ matrix whose entries depend on $F_1$, and $B$ is a length-$n(b + \delta)$ vector. The system (7)-(8) has a unique solution if and only if $A$ has full column rank. However, Calvin has partial control over $F$, and his goal is to design his error $Z$ so that this will not be the case.

In what follows, we show that Calvin cannot succeed. Namely, we show, with high probability over the entries of $D$, that no matter what the value of $F$ is, the system (7)-(8) has a unique solution. Our proof has the following structure. We first show that for a fixed $F_1$, the matrix $A$ has full column rank with high probability over $D$. We then note that the number of possible different matrices $F_1$ is at most $q^{z_O n}$ (this follows from the size of $F_1$). Finally, applying the union bound we obtain our result.

We  start  with  some  notation.  Assume  that  X  is  arranged  by 
stacking  the  columns  of  A'  one  on  top  of  the  other,  where  the 
columns  of  Ar|  appear  on  the  top  of  X.  Also,  we  fix  the  (i,j) th 
entry  of  F\  to  be  fa  .  Then,  the  matrix 


The  matrix  A  is  described  by  smaller  dimensional  matrices  as 
entries.  Namely,  the  identity  matrices  /  appearing  above  have 
dimension  6,  the  identity  matrix  I  has  dimension  b(n  —  zo), 
and  the  zero  matrix  0  has  dimension  zob  x  b(n  -  zo )•  We  now 
analyze  the  column  rank  of  A. 

Clearly, the last $b(n - z_O)$ columns of $A$ are independent. Thus,
any set of dependent columns of $A$ must include at least one of the
first $b z_O$ columns. Let
$V = \{u_1, \ldots, u_{b z_O}; v_1, \ldots, v_{b(n - z_O)}\}$ be the set
of columns of $A$ (here the $\{u_i\}$ vectors correspond to the leftmost
$b z_O$ columns of $A$). We break the $\{u_i\}$ and $\{v_j\}$ vectors
into two parts. The components of the $\{u_i\}$ and $\{v_j\}$ vectors in
the top $bn$ rows of $A$ are denoted, respectively, as $\{u'_i\}$ and
$\{v'_j\}$. The components of the $\{u_i\}$ and $\{v_j\}$ vectors in the
bottom $\delta n$ rows of $A$ are denoted, respectively, as $\{u''_i\}$
and $\{v''_j\}$. The matrix $A$ is rank-deficient if and only if there
exist $\{\alpha_i\}$ and $\{\beta_j\}$, not all zero, such that
$\sum_i \alpha_i u_i + \sum_j \beta_j v_j = 0$. Note that there is a
one-to-one correspondence between the values $\{\alpha_i\}$ and the
values $\{\beta_j\}$ in the above equality. Namely, for each setting of
$\{\alpha_i\}$, there is a unique setting of $\{\beta_j\}$ for which
$\sum_i \alpha_i u'_i + \sum_j \beta_j v'_j = 0$. Further, for every
setting of the values $\{\alpha_i\}$ (and a corresponding setting for
$\{\beta_j\}$), the probability over $D$ that
$\sum_i \alpha_i u''_i + \sum_j \beta_j v''_j = 0$ is at most
$q^{-\delta n}$. This implies that, for any fixed $\{\alpha_i\}$, the
probability that $\sum_i \alpha_i u''_i + \sum_j \beta_j v''_j = 0$ is
asymptotically negligible. Then, an additional use of the union bound
on all $q^{b z_O}$ possible values of $\{\alpha_i\}$ suffices to obtain
our proof.

All in all, Bob fails to uniquely decode with probability at most
$q^{z_O n} q^{b z_O} q^{-\delta n}$ (the first term corresponds to the
union bound over the values of $F_1 = [f_{i,j}]$, the second term
corresponds to the union bound over the values of $\{\alpha_i\}$, and
the third term corresponds to the failure probability). Setting
$\delta = z_O + \varepsilon$ suffices for our proof, since the exponent
then equals $b z_O - \varepsilon n$, which is negative for large $n$.
The computational cost of design, encoding, and decoding is dominated
by solving the system of (7)-(8), and thus equals $O((nC)^3)$.
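As a rough numerical illustration of the bound above, the following sketch computes the exponent of $q$ in the union bound and the smallest packet length for which it becomes negative. The function names are hypothetical, and the parameters are taken to be integers purely for convenience.

```python
def log_q_failure_bound(n: int, b: int, z_o: int, delta: int) -> int:
    """Exponent e such that the union bound on Bob's failure reads q**e.

    z_o * n  : union bound over the entries of F_1
    b * z_o  : union bound over the values of {alpha_i}
    -delta*n : per-choice failure probability
    """
    return z_o * n + b * z_o - delta * n


def smallest_useful_n(b: int, z_o: int, eps: int) -> int:
    """Smallest n making the exponent negative when delta = z_o + eps.

    With delta = z_o + eps the exponent is b*z_o - eps*n, which is
    negative exactly when n > b*z_o / eps.
    """
    return (b * z_o) // eps + 1
```

For instance, with $b = 4$, $z_O = 2$, and $\varepsilon = 1$, the exponent is $8 - n$, so the bound only starts to be informative once $n$ exceeds 8.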

VIII.  Limited  Adversary  Model 

In this section, we combine the strengths of the Shared Secret and the
Omniscient Adversary algorithms of Sections VI and VII, respectively.
We then achieve the higher rate of $C - z_O$ without the need of a
secret channel. The caveat is that Calvin is more limited: he can only
eavesdrop on part of the edges in the network. Specifically, the number
of packets he can transmit, $z_O$, and the number he can eavesdrop on,
$z_I$, satisfy the technical constraint


$$2 z_O + z_I < C. \tag{9}$$

We call such an adversary a Limited Adversary.

The main idea underlying our Limited Adversary algorithm is simple.
Alice uses the Omniscient Adversary algorithm to transmit a "short,
scrambled" message to Bob at rate $C - 2z_O$. By (9), the rate $z_I$ at
which Calvin eavesdrops is strictly less than Alice's rate of
transmission $C - 2z_O$. Hence, Calvin cannot decode Alice's message,
but Bob can. This means Alice's scrambled message to Bob contains a
secret $S$ that is unknown to Calvin. Once $S$ has been shared from
Alice to Bob, they can use the Shared Secret algorithm to transmit the
bulk of Alice's message to Bob at the higher rate $C - z_O$.
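The two-phase idea can be summarized in a few lines. The sketch below is only a bookkeeping aid, assuming integer parameters; the function and key names are hypothetical. It checks constraint (9) and reports the rate used for the secret, the rate used for the bulk of the message, and the margin by which the secret's rate exceeds Calvin's eavesdropping rate.

```python
def limited_adversary_rates(capacity: int, z_o: int, z_i: int) -> dict:
    """Rates of the two phases of the Limited Adversary scheme.

    capacity : network capacity C
    z_o      : number of packets Calvin can inject
    z_i      : number of packets Calvin can eavesdrop on
    """
    if 2 * z_o + z_i >= capacity:
        raise ValueError("constraint (9) violated: need 2*z_o + z_i < C")
    secret_rate = capacity - 2 * z_o   # Omniscient Adversary phase
    bulk_rate = capacity - z_o         # Shared Secret phase
    return {
        "secret_rate": secret_rate,
        "bulk_rate": bulk_rate,
        # Positive by (9): Calvin sees fewer packets than the secret's rate.
        "eavesdrop_margin": secret_rate - z_i,
    }
```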

A.  Alice's  Encoder 

Alice's encoder follows essentially the schema described in the
previous paragraph. The information $S$ she transmits to Bob via the
Omniscient Adversary algorithm is padded with some random symbols.
This is for two reasons. First, the randomness in the padded symbols
ensures strong information-theoretic secrecy of $S$. That is, we show
in Claim 7 that Calvin's best estimate of any function of $S$ is no
better than if he randomly guessed the value of the function. Second,
since the Omniscient Adversary algorithm has a probability of error
that decays exponentially with the size of the input, it is not
guaranteed to perform well when only a small message is transmitted.

Alice divides her information $X$ into two parts $[X_1 \; X_2]$. She
uses the information she wishes to transmit to Bob (at rate
$R = (C - z_O)(1 - \Delta)$) as the input to the encoder of the Shared
Secret algorithm. The output of this step is the
$b \times n(1 - \Delta)$ submatrix $X_1$. Here $\Delta$ is a parameter
that enables Alice to trade between the probability of error and rate
loss.

The second submatrix $X_2$, which we call the secrecy matrix, is
analogous to the secret $S$ used in the Shared Secret algorithm
described in Section VI. The size of $X_2$ is $b \times n\Delta$. In
fact, $X_2$ is an encoding of the secret $S$ Alice generates in the
Shared Secret algorithm. The $\gamma = (b(b + z_O) + 1)(b + 1)$ symbols
corresponding to the parity symbols $\{r_j\}$ and the hash matrix $H$
are written in the form of a length-$\gamma$ column vector. This vector
is appended with symbols chosen uniformly at random from $\mathbb{F}_q$
to result in the length-$(C - z_O - \delta) n\Delta$ vector $U'$. Alice
multiplies $U'$ by a random square matrix to generate the input $U$.
This vector $U$ functions as the input to the Omniscient Adversary
algorithm operated over a packet-size $n\Delta$ with a probability of
decoding error that is exponentially small in $n\Delta$. The output of
this step is $X_2$.
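The pad-and-mix step described above can be sketched as follows. This is a minimal illustration, not the paper's construction: it assumes a prime field (modulus `P = 257`) instead of a general $\mathbb{F}_q$, redraws the random square matrix until it is invertible, and assumes Bob knows that matrix. All function names are hypothetical.

```python
import random

P = 257  # assumed prime modulus standing in for the field F_q


def invert_mod_p(mat, p=P):
    """Gauss-Jordan inverse of a square matrix over F_p (None if singular)."""
    n = len(mat)
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % p), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mat_vec(mat, vec, p=P):
    """Matrix-vector product over F_p."""
    return [sum(a * x for a, x in zip(row, vec)) % p for row in mat]


def scramble(secret, total_len, rng, p=P):
    """Pad the secret with uniform symbols, then mix with a random matrix."""
    padded = list(secret) + [rng.randrange(p)
                             for _ in range(total_len - len(secret))]
    while True:
        mixing = [[rng.randrange(p) for _ in range(total_len)]
                  for _ in range(total_len)]
        if invert_mod_p(mixing, p) is not None:
            return mixing, mat_vec(mixing, padded, p)


def unscramble(mixing, scrambled, secret_len, p=P):
    """Invert the mixing and keep the leading secret symbols."""
    return mat_vec(invert_mod_p(mixing, p), scrambled, p)[:secret_len]
```

In the scheme above, the scrambled vector would then be fed to the Omniscient Adversary encoder; that step is outside this sketch.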

The  following  claim  ensures  that  S  is  indeed  secret  from 
Calvin. 

Claim 7: Let $\gamma = (b(b + z_O) + 1)(b + 1)$. The probability that
Calvin guesses $S$ correctly is at most $q^{-\gamma}$, i.e., $S$ is
information-theoretically secret from Calvin.

The  proof  of  Claim  7  follows  from  a  direct  extension  of  the 
secure  communication  scheme  of  [6]  to  our  scenario. 

The two components of $X$, i.e., $X_1$ and $X_2$, respectively,
correspond to the information Alice wishes to transmit to Bob, and an
implementation of the low-rate secret channel. The fraction of the
packet size corresponding to $X_2$ is "small," i.e., $\Delta$. Finally,
Alice implements the classical random encoder described in Section
III-B.

B.  Bob's  Decoder 

Bob arranges his received packets into the matrix
$Y = [Y_1 \; Y_2]$. The submatrices $Y_1$ and $Y_2$ are, respectively,
the network transforms of $X_1$ and $X_2$.

Bob decodes in two steps. Bob first recovers $S$ by decoding $Y_2$ as
follows. He begins by using the Omniscient Adversary decoder to obtain
the vector $U$. He then obtains $U'$ from $U$ by inverting the mapping
specified in Alice's encoder. He finally extracts from $U'$ the
$\gamma$ symbols corresponding to $S$.

TABLE II
COMPARISON OF OUR THREE ALGORITHMS

Algorithm       Adversarial strength            Rate        Complexity
Shared Secret   z_O < C,   z_I = network        C - z_O     O(nC^2)
Omniscient      z_O < C/2, z_I = network        C - 2z_O    O((nC)^3)
Limited         z_I + 2z_O < C                  C - z_O     O(nC^3)

Alice has now shared $S$ with Bob. Bob uses $S$ as the side
information used by the decoder of the Shared Secret algorithm to
decode $Y_1$. This enables him to recover $X_1$, which contains Alice's
information at rate $R = C - z_O$. The probability of error is
dominated by the sum of the probabilities of error in Theorems 1 and
2, with the parameter $n$ replaced by $n\Delta$. The Limited Adversary
algorithm is essentially a concatenation of the Shared Secret algorithm
with the Omniscient Adversary algorithm; thus, the computational cost
is dominated by the sum of the two (with $n\Delta$ replacing $n$).
Choosing $\Delta$ appropriately (say $n\Delta = n^{1/3}$), one may
bound the complexity by $O(nC^3)$.
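The arithmetic behind this choice is short: a cubic cost in (packet length times $C$) evaluated at packet length $n^{1/3}$ is linear in $n$. The sketch below checks this with integers, assuming $n$ is a perfect cube and dropping all constants; the function names are hypothetical.

```python
def omniscient_decoding_cost(packet_len: int, capacity: int) -> int:
    """Cubic cost model (packet_len * capacity)**3, constants dropped."""
    return (packet_len * capacity) ** 3


def secrecy_packet_len(n: int) -> int:
    """Integer cube root of n, i.e. the choice n*Delta = n**(1/3)."""
    m = round(n ** (1 / 3))
    # Correct any floating-point rounding so that m is the exact floor root.
    while m ** 3 > n:
        m -= 1
    while (m + 1) ** 3 <= n:
        m += 1
    return m
```

With $n = 1000$ and $C = 5$, the secrecy part has packet length 10 and cost $(10 \cdot 5)^3 = 1000 \cdot 5^3$, which matches $nC^3$.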

IX.  Conclusion 

Random network codes are vulnerable to Byzantine adversaries. This
work makes them secure. We provide algorithms² which are
information-theoretically secure and rate-optimal for different
adversarial strengths (as shown in Table II). When the adversary is
omniscient, we show how to achieve a rate of $C - 2z_O$, where $z_O$ is
the number of packets the adversary injects and $C$ is the network
capacity. If the adversary cannot observe everything, our algorithms
achieve a higher rate, $C - z_O$. Both rates are optimal. Further, our
algorithms are practical; they are distributed, have polynomial-time
complexity, and require no changes at the internal nodes.

References 

[1] R. Ahlswede, N. Cai, S. R. Li, and R. W. Yeung, "Network information
flow," IEEE Trans. Inf. Theory, vol. 46, no. 5, pp. 1204-1216, Jul. 2000.

[2] N. Cai and R. W. Yeung, "Secure network coding," in Proc. IEEE Int.
Symp. Information Theory, Lausanne, Switzerland, Jun./Jul. 2002, p.
323.

[3] N. Cai and R. W. Yeung, "Network error correction, Part 2: Lower
bounds," Commun. Inf. and Syst., vol. 6, no. 1, pp. 37-54, Jan. 2006.

[4] D. Charles, K. Jain, and K. Lauter, "Signatures for network coding,"
in Proc. 40th Annu. Conf. Information Science and Systems, Princeton,
NJ, Mar. 2006.

[5] D. Dolev, C. Dwork, O. Waarts, and M. Yung, "Perfectly secure
message transmission," J. Assoc. Comput. Mach., vol. 40, no. 1, pp.
17-47, Jan. 1993.

[6] J. Feldman, T. Malkin, C. Stein, and R. A. Servedio, "On the capacity
of secure network coding," in Proc. 42nd Annu. Allerton Conf.
Communication, Control, and Computing, Monticello, IL, Sep. 2004.

[7] C. Fragouli and E. Soljanin, Network Coding Fundamentals. Now, 2007.

[8] C. Gkantsidis and P. Rodriguez, "Network coding for large scale
content distribution," in Proc. IEEE Conf. Computer Communications
(INFOCOM), Miami, FL, Mar. 2005.

2 A  refinement  of  some  of  the  algorithms  in  this  work  can  be  found  in  [13]. 




[9] C. Gkantsidis and P. Rodriguez, "Cooperative security for network
coding file distribution," in Proc. IEEE Conf. Computer Communications
(INFOCOM), Barcelona, Spain, Apr. 2006.

[10] T. Ho, R. Koetter, M. Médard, D. Karger, and M. Effros, "The benefits
of coding over routing in a randomized setting," in Proc. IEEE Int.
Symp. Information Theory, Yokohama, Japan, Jun./Jul. 2003, p. 442.

[11] T. Ho, M. Médard, J. Shi, M. Effros, and D. Karger, "On randomized
network coding," in Proc. 41st Annu. Allerton Conf. Communication,
Control, and Computing, Monticello, IL, Oct. 2003.

[12] T. C. Ho, B. Leong, R. Koetter, M. Médard, M. Effros, and D. R.
Karger, "Byzantine modification detection in multicast networks using
randomized network coding," in Proc. IEEE Int. Symp. Information
Theory, Chicago, IL, Jun. 2004, p. 144.

[13] S. Jaggi and M. Langberg, "Resilient network coding in the presence
of eavesdropping Byzantine adversaries," in Proc. IEEE Int. Symp.
Information Theory, Nice, France, Jun. 2007, pp. 541-545.

[14] S. Jaggi, M. Langberg, T. Ho, and M. Effros, "Correction of
adversarial errors in networks," in Proc. IEEE Int. Symp. Information
Theory, Adelaide, Australia, Sep. 2005, pp. 1455-1459.

[15] S. Jaggi, P. Sanders, P. A. Chou, M. Effros, S. Egner, K. Jain, and
L. Tolhuizen, "Polynomial time algorithms for multicast network code
construction," IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 1973-1982,
Jun. 2005.

[16] A. Jiang, "Network coding for joint storage and transmission with
minimum cost," in Proc. IEEE Int. Symp. Information Theory, Seattle, WA,
Jul. 2006, pp. 1359-1363.

[17] S. Katti, D. Katabi, W. Hu, H. S. Rahul, and M. Médard, "The
importance of being opportunistic: Practical network coding for wireless
environments," in Proc. 43rd Annu. Allerton Conf. Communication,
Control, and Computing, Monticello, IL, Sep. 2005.

[18] S. Katti, H. Rahul, D. Katabi, W. Hu, M. Médard, and J. Crowcroft,
"XORs in the air: Practical wireless network coding," in Proc. ACM
SIGCOMM, Pisa, Italy, Sep. 2006.

[19] R. Koetter and F. Kschischang, "Coding for errors and erasures in
random network coding," in Proc. IEEE Int. Symp. Information Theory,
Nice, France, Jun. 2007, pp. 791-795.


[20] R. Koetter and M. Médard, "Beyond routing: An algebraic approach to
network coding," in Proc. 21st Annu. Joint Conf. IEEE Computer and
Communications Societies (INFOCOM), 2002, vol. 1, pp. 122-130.

[21] R. Koetter and M. Médard, "An algebraic approach to network coding,"
IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782-795, Oct. 2003.

[22] M. N. Krohn, M. J. Freedman, and D. Mazières, "On-the-fly
verification of rateless erasure codes for efficient content
distribution," in Proc. IEEE Symp. Security and Privacy, Oakland, CA,
May 2004.

[23] S.-Y. R. Li, R. W. Yeung, and N. Cai, "Linear network coding," IEEE
Trans. Inf. Theory, vol. 49, no. 2, pp. 371-381, Feb. 2003.

[24] D. Lun, M. Médard, T. Ho, and R. Koetter, "On network coding with
a cost criterion," in Proc. IEEE Int. Symp. Information Theory and Its
Applications, Parma, Italy, Oct. 2004.

[25] D. S. Lun, M. Médard, and R. Koetter, "Efficient operation of
wireless packet networks using network coding," in Proc. Int. Workshop
on Convergent Technologies (IWCT), Oulu, Finland, Jun. 2005.

[26] A. Pelc and D. Peleg, "Broadcasting with locally bounded Byzantine
faults," Inf. Process. Lett., vol. 93, no. 3, pp. 109-115, Feb. 2005.

[27] L. Subramanian, "Decentralized Security Mechanisms for Routing
Protocols," Ph.D. dissertation, Univ. Calif. Berkeley, Computer Science
Division, Berkeley, CA, 2005.

[28] J. E. Wieselthier, G. D. Nguyen, and A. Ephremides, "On the
construction of energy-efficient broadcast and multicast trees in
wireless networks," in Proc. IEEE INFOCOM, 2000, vol. 2, pp. 585-594.

[29] R. W. Yeung and N. Cai, "Network error correction, Part 1: Basic
concepts and upper bounds," Commun. Inf. and Syst., vol. 6, no. 1, pp.
19-36, Jan. 2006.

[30] R. W. Yeung, S. Li, N. Cai, and Z. Zhang, Network Coding Theory.
Amsterdam, The Netherlands: Now, 2006.

[31] Z. Zhang, "Linear network error correction codes in packet networks,"
IEEE Trans. Inf. Theory, vol. 54, no. 1, pp. 209-218, Jan. 2008.

[32] F. Zhao, T. Kalker, M. Médard, and K. J. Han, "Signatures for content
distribution with network coding," in Proc. IEEE Int. Symp. Information
Theory, Nice, France, Jun. 2007, pp. 556-560.



Byzantine  Modification  Detection  in  Multicast  Networks 
With  Random  Network  Coding 

Tracey Ho, Member, IEEE, Ben Leong,
Ralf Koetter, Senior Member, IEEE, Muriel Médard, Fellow, IEEE,
Michelle Effros, Senior Member, IEEE, and
David R. Karger, Associate Member, IEEE


Abstract: An information-theoretic approach for detecting Byzantine or
adversarial modifications in networks employing random linear network
coding is described. Each exogenous source packet is augmented with a
flexible number of hash symbols that are obtained as a polynomial
function of the data symbols. This approach depends only on the
adversary not knowing the random coding coefficients of all other
packets received by the sink nodes when designing its adversarial
packets. We show how the detection probability varies with the overhead
(ratio of hash to data symbols), coding field size, and the amount of
information unknown to the adversary about the random code.

Index  Terms — Byzantine  adversary,  multicast,  network  coding,  network 
error  detection. 


I.  Introduction 

We consider the problem of information-theoretic detection of
Byzantine, i.e., arbitrary, modifications of transmitted data in a
network coding setting.

Interest in network coding has grown following demonstrations of
its various advantages: in network capacity [1], robustness to
nonergodic network failures [2] and ergodic packet erasures [3], [4],
and distributed network operation [5]. Multicast in overlay and ad hoc
networks is a promising application. Since packets are forwarded by end
hosts to other end hosts, such networks are susceptible to Byzantine
errors introduced by compromised end hosts.

We show that Byzantine modification detection capability can be
added to a multicast scheme based on random linear block network
coding [5], [6], with modest additional computational and
communication overhead, by incorporating a simple polynomial hash/check
value in each packet. With this approach, a sink node can detect
Byzantine modifications with high probability, as long as these
modifications have not been designed with knowledge of the random
coding combinations present in all other packets obtained at the sink:
the only essential condition is the adversary's incomplete knowledge of
the random network code seen by the sink. No other assumptions are made
regarding the topology of the network or the adversary's power to
corrupt or inject packets. The adversary can know the entire message as
well as portions of the random network code, and can have the same (or
greater) transmission capacity compared to the source. This approach
works even in the extreme case where every packet received by a sink
has been corrupted by being coded together with an independent
adversarial packet. This new adversarial model may be useful for
application scenarios in which conventional assumptions of an upper
bound on adversarial transmission capacity are less appropriate. For
instance, in some peer-to-peer or wireless ad hoc settings we may not
know how many adversarial nodes might join the network, while it may be
more likely that there will be some transmissions that are not received
by the adversarial nodes. In such cases, our approach can provide a
useful alternative to existing methods.

Our approach provides much flexibility in trading off between the
detection probability, the proportion of redundancy, the coding field
size, and the amount of information about the random code that is not
observed by the adversary. This approach can be used for low overhead
monitoring during normal conditions when no adversary is known to be
present, in conjunction with more complex, higher overhead techniques
which are activated upon detection of a Byzantine error, such as adding
more redundancy for error correction.

A preliminary version of this work with less general assumptions
appeared in [7]. The security model is substantially generalized and
strengthened in this work.

A.  Background  and  Related  Work 

The problem of secure network communications in the presence of
Byzantine adversaries has been studied extensively, e.g., [8]-[11]. A
survey of information-theoretic research in this area is given in [12].
Two important issues are secrecy and authenticity;¹ this work concerns
the latter. Like one-time pads [13], our approach relies on the
generation of random values unknown to the adversary, though the
one-time pad provides secrecy and not authenticity.

In the network coding context, the problem of ensuring secrecy in
the presence of a wiretap adversary has been considered in [14]-[16].
The problem of correcting adversarial errors, which is complementary
to our work, has been studied in [17]-[21].

Adversarial models in existing works on information-theoretic
authenticity techniques commonly assume some upper bound on the
number of adversarial transmissions, which leads to a requirement
on the amount of redundant network capacity. For the problem of
adversarial error correction or resilient communication, the number
of links/transmissions controlled by the adversary must necessarily
be limited with respect to the number of links/transmissions in a
minimum source-sink cut or the amount of redundancy transmitted
by the source. For instance, in the resilient communication problem
of Dolev et al. [9], the source and sink are connected by $n$ wires,
and their model requires that no more than $(n - 1)/2$ wires are
disrupted by an adversary for resilient communication to be possible.
In the network coding error correction problems of [17], [20], [21],
the rate of redundant information that the source needs to transmit is
between one and two times the maximum rate of information that can be
injected by the adversary, depending on the specific adversarial model.

The above techniques can also be considered in the context of error
detection. For example, in one phase of the secret sharing based
algorithm in [9], the source communicates a degree $r$ polynomial
$f(x) \in \mathbb{F}_q[x]$ by sending $f(i)$ on the $i$th wire. If the
adversary controls at most $n - r$ wires, any errors it introduces can
be detected. In general, for approaches based on error-correcting codes
such as in [17], the number of adversarial errors that can be detected
is given by the difference between the source-sink minimum cut and the
source information rate.

1 These are independent attributes of a cryptographic system [13].
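To make the polynomial-on-wires example concrete, the sketch below sends evaluations of a polynomial over a small prime field and checks whether the received values still lie on a single polynomial of the stated degree. It is a generic consistency check written for illustration, not the algorithm of [9]; the modulus `P = 11` and all function names are assumptions, and the number of wires must be smaller than the modulus.

```python
P = 11  # assumed small prime modulus; wires are indexed 1..n with n < P


def evaluate(coeffs, x, p=P):
    """Horner evaluation of sum(coeffs[k] * x**k) over F_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def interpolate_at(points, x, p=P):
    """Value at x of the Lagrange interpolant through `points` over F_p."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % p
                den = den * (xj - xm) % p
        total = (total + yj * num * pow(den, -1, p)) % p
    return total


def consistent(received, degree, p=P):
    """True if all wire values lie on one polynomial of degree <= `degree`.

    received[i - 1] is the value observed on wire i.
    """
    points = [(i, y) for i, y in enumerate(received, start=1)]
    base = points[:degree + 1]  # degree + 1 points determine the polynomial
    return all(interpolate_at(base, x, p) == y
               for x, y in points[degree + 1:])
```

A single modified wire value breaks consistency in this setting; how many modified wires can be tolerated in general depends on $n$ and the degree, as discussed above.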

Such  approaches  have  a  threshold  nature  in  that  they  do  not  offer 
graceful  performance  degradation  when  the  number  of  adversarial 
transmissions  exceeds  the  assumed  upper  bound.  Their  efficiency  is 
also  sensitive  to  overestimates  of  adversarial  transmission  capacity, 
which  determines  the  amount  of  redundancy  required. 

The adversarial model considered in this work is slightly different.
Instead of assuming a limit on the number of adversarial errors, our
only assumption is on the incompleteness of the adversary's knowledge
of the random code (the adversary can know the entire source
message). In this case, the overhead (proportion of redundant
information transmitted by the source) is no longer a function of the
estimated upper bound on the number of adversarial errors. Instead, it
is a design parameter which, as we will show, can be flexibly traded
off against detection probability and coding field size. Unlike
approaches based on secret sharing and its variants, where the required
proportional overhead is a function of the adversarial strength, in our
approach, for any nonzero proportional overhead and any adversarial
strength short of full knowledge or control of network transmissions,
the detection probability can be made arbitrarily high by increasing
the field size. The former has the advantage of deterministic
guarantees, while our approach has the advantage of greater flexibility
with additional performance parameters that can be traded off against
one another.

The use of our error detection technique for low-overhead
monitoring under normal conditions when no adversary is known to be
present, in conjunction with a more complex technique activated upon
detection of an adversary, has a parallel in works such as [22] and
[23]. These works optimize for normal conditions by using less complex
message authentication codes and signed digests, respectively, during
normal operation, resorting to more complex recovery mechanisms
only upon detection of a fault.

B.  Notation 

In this work, we denote matrices with bold uppercase letters and
vectors with bold lowercase letters. All vectors are row vectors unless
indicated otherwise with a subscript T. We denote by $[x, y]$ the
concatenation of two row vectors $x$ and $y$.

II. Model and Problem Formulation

Consider random linear block network coding [5], [6], [24] of a block
of $r$ exogenous packets which originate at a source node and are
multicast to one or more sink nodes. We assume that the network coding
subgraph is given by some separate mechanism, the details of which
we are not concerned with.² An adversary observes some subset of
packets transmitted in the network, and can corrupt, insert or delete
one or more packets, or corrupt some subset of nodes. The only
assumption we make is that the adversary's observations are limited
such that when designing the adversarial packets, the adversary does
not know the random coding combinations present in all other packets
obtained at the sinks. This assumption is made precise using the notion
of secret packets which we define below. The source and sinks do not
share any keys or common information.

Each packet $p$ in the network is represented by a row vector $w_p$ of
$d + c + r$ symbols from a finite field $\mathbb{F}_q$, where the first
$d$ entries are data symbols, the next $c$ are redundant hash symbols,
and the last $r$ form the packet's (global) coefficient vector $t_p$.
The field size is 2 to the power of the symbol length in bits. The hash
symbols in each exogenous packet are given by a function
$\psi_d : \mathbb{F}_q^d \to \mathbb{F}_q^c$ of the data symbols. The
coefficient vector of the $i$th exogenous packet is the unit vector
with a single nonzero entry in the $i$th position. The coefficient
vectors are used for decoding at the sinks as explained below.

2 The network coding subgraph defines the times at which packets are or
can be transmitted on each network link (see, e.g., [25]).
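The packet layout just described (data symbols, then hash symbols, then the coefficient vector) can be sketched as below. The hash function here is a hypothetical stand-in supplied by the caller, not the polynomial hash of this work, and a small prime modulus `Q = 7` is assumed in place of the field of size $2^m$ used in the text.

```python
Q = 7  # assumed prime modulus, used only by the toy hash below


def toy_hash(data, q=Q):
    """Hypothetical single-symbol hash (c = 1); a stand-in, not psi_d."""
    return [sum(x * x for x in data) % q]


def exogenous_packets(data_rows, hash_fn):
    """Build w_p = [data | hash | unit coefficient vector] for each row."""
    r = len(data_rows)
    packets = []
    for i, data in enumerate(data_rows):
        unit = [1 if j == i else 0 for j in range(r)]
        packets.append(list(data) + list(hash_fn(data)) + unit)
    return packets


def split_packet(w, d, c):
    """Split a packet into (data, hash, coefficient vector)."""
    return w[:d], w[d:d + c], w[d + c:]
```

With two data rows of length $d = 2$ and $c = 1$ hash symbol, each packet has $d + c + r = 5$ symbols, and the last $r = 2$ symbols of packet $i$ form the $i$th unit vector.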

Each packet transmitted by the source node is an independent
random linear combination of the $r$ exogenous packets, and each
packet transmitted by a nonsource node is an independent random
linear combination of packets received at that node. The coefficients
of these linear combinations are chosen with the uniform
distribution from the finite field $\mathbb{F}_q$, and the same linear
operation is applied to each symbol in a packet. For instance, if
packet $p_3$ is formed as a random linear combination of packets $p_1$
and $p_2$, then
$w_{p_3} = \gamma_{1,3} w_{p_1} + \gamma_{2,3} w_{p_2}$, where
$\gamma_{1,3}$ and $\gamma_{2,3}$ are random scalar coding coefficients
distributed uniformly over $\mathbb{F}_q$.
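A minimal sketch of this combining step follows, assuming a prime modulus `Q = 257` so that field arithmetic reduces to integer arithmetic mod `Q` (the text uses a field of size $2^m$, which would need polynomial arithmetic instead). The function name is hypothetical.

```python
import random

Q = 257  # assumed prime modulus standing in for F_q


def random_combination(packets, rng, q=Q):
    """Return (coefficients, combined packet) for a random linear mix.

    The same scalar coefficient is applied to every symbol of a packet,
    including its coefficient vector, so the result stays self-describing.
    """
    coeffs = [rng.randrange(q) for _ in packets]
    length = len(packets[0])
    combined = [sum(g * pkt[k] for g, pkt in zip(coeffs, packets)) % q
                for k in range(length)]
    return coeffs, combined
```

If the inputs are two exogenous packets whose coefficient vectors are the unit vectors, the trailing two symbols of the output are exactly the drawn coefficients $\gamma_{1,3}$ and $\gamma_{2,3}$.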

Let row vector $m_i \in \mathbb{F}_q^{(c+d)}$ represent the
concatenation of the data and hash symbols for the $i$th exogenous
packet, and let $M$ be the matrix whose $i$th row is $m_i$. A packet
$p$ is genuine if its data/hash symbols are consistent with its
coefficient vector, i.e., $w_p = [t_p M, t_p]$. The exogenous packets
are genuine, and any packet formed as a linear combination of genuine
packets is also genuine. Adversarial packets, i.e., packets transmitted
by the adversary, may contain arbitrary coefficient vector and
data/hash values. An adversarial packet $p$ can be represented in
general by $[t_p M + v_p, t_p]$, where $v_p$ is an arbitrary vector in
$\mathbb{F}_q^{c+d}$. If $v_p$ is nonzero, $p$ (and linear combinations
of $p$ with genuine packets) are nongenuine.

A set $\mathcal{S}$ of packets can be represented as a block matrix
$[T_\mathcal{S} M + V_\mathcal{S} \mid T_\mathcal{S}]$ whose $i$th row
is $w_{p_i}$, where $p_i$ is the $i$th packet of the set. A sink node
$t$ attempts to decode when it has collected a decoding set consisting
of $r$ linearly independent packets (i.e., packets whose coefficient
vectors are linearly independent). For a decoding set $\mathcal{D}$,
the decoding process is equivalent to premultiplying the matrix
$[T_\mathcal{D} M + V_\mathcal{D} \mid T_\mathcal{D}]$ with
$T_\mathcal{D}^{-1}$. This gives
$[M + T_\mathcal{D}^{-1} V_\mathcal{D} \mid I]$, i.e., the receiver
decodes to $M + \tilde{M}$, where

$$\tilde{M} = T_\mathcal{D}^{-1} V_\mathcal{D} \tag{1}$$

gives the disparity between the decoded packets and the original
packets. If at least one packet in a decoding set is nongenuine,
$V_\mathcal{D} \neq 0$, and the decoded packets will differ from the
original packets. A decoded packet is inconsistent if its data and hash
values do not match, i.e., applying the function $\psi_d$ to its data
values does not yield its hash values. If one or more decoded packets
are inconsistent, the sink declares an error.
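The decoding step and the disparity in (1) can be checked numerically. The sketch below assumes a small prime modulus `Q = 7` instead of a field of size $2^m$, and it assumes the coefficient rows are linearly independent, as required of a decoding set; all function names are hypothetical.

```python
Q = 7  # assumed prime modulus standing in for F_q


def mat_mul(a, b, q=Q):
    """Matrix product over F_q."""
    return [[sum(x * y for x, y in zip(row, col)) % q for col in zip(*b)]
            for row in a]


def mat_add(a, b, q=Q):
    """Entrywise matrix sum over F_q."""
    return [[(x + y) % q for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def decode(coeff_rows, payload_rows, q=Q):
    """Return T^{-1} * payload by Gauss-Jordan elimination on [T | payload].

    Assumes the rows of T are linearly independent (a valid decoding set).
    """
    r = len(coeff_rows)
    aug = [list(t) + list(w) for t, w in zip(coeff_rows, payload_rows)]
    for col in range(r):
        pivot = next(i for i in range(col, r) if aug[i][col] % q)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, q)  # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % q for x in aug[col]]
        for i in range(r):
            if i != col and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(x - f * y) % q for x, y in zip(aug[i], aug[col])]
    return [row[r:] for row in aug]
```

Decoding the payload $TM + V$ returns $M$ when $V = 0$, and otherwise returns $M$ plus `decode(T, V)`, which is the disparity of (1).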

The coefficient vector of a packet transmitted by the source is
uniformly distributed over $\mathbb{F}_q^r$; if a packet whose
coefficient vector has this uniform distribution is linearly combined
with other packets, the resulting packet's coefficient vector has the
same uniform distribution. We are concerned with the distribution of
decoding outcomes conditioned on the adversary's information, i.e., the
adversary's observed and transmitted packets, and its information on
independencies/dependencies among packets. Note that in this setup,
scaling a packet by some scalar element of $\mathbb{F}_q$ does not
change the distribution of decoding outcomes.

For given $M$, the value of a packet $p$ is specified by the row vector
$u_p = [t_p, v_p]$. We call a packet $p$ secret if, conditioned on the
value of $v_p$ and the adversary's information, its coefficient vector
$t_p$ is uniformly distributed over $\mathbb{F}_q^r \setminus W$ for
some (possibly empty) subspace or affine space
$W \subset \mathbb{F}_q^r$.³ Intuitively, secret packets include
genuine packets whose coefficient vectors are unknown (in the above
sense) to the adversary, as well as packets formed as linear
combinations involving at least one secret packet. A set $\mathcal{S}$
of secret packets is secrecy-independent if each of the packets remains
secret when the adversary is allowed to observe the other packets in
the set; otherwise it is secrecy-dependent. Secrecy-dependencies arise
from the network transmission topology; for instance, if a packet $p$
is formed as a linear combination of a set $\mathcal{S}$ of secret
packets (possibly with other nonsecret packets), then
$\mathcal{S} \cup \{p\}$ is secrecy-dependent.

3 This definition of a secret packet is conservative as it does not
distinguish between packets with a nonuniform conditional distribution
and packets that are fully known to the adversary. Taking this
distinction into account would make the analysis more complicated but
would in some cases give a better bound on detection probability.

To illustrate these definitions, suppose that the adversary knows that
a sink's decoding set contains an adversarial packet $p_1$ as well as a
packet $p_4$ formed as some linear combination
$k_2 w_{p_2} + k_3 w_{p_3}$ of an adversarial packet $p_2$ with a
genuine packet $p_3$, so the adversary knows $t_{p_1}$, $t_{p_2}$,
$v_{p_1}$, $v_{p_2}$ and $v_{p_3} = 0$. Since a decoding set consists
of packets with linearly independent coefficient vectors, the adversary
knows that $t_{p_1}$ and $t_{p_4}$ are linearly independent. Suppose
also that the adversary does not observe the contents of any packets
dependent on $p_3$. Thus, the distribution of $t_{p_4}$, conditioned on
the adversary's information and any potential value $k_2 v_{p_2}$ for
$v_{p_4}$, is uniform over
$\mathbb{F}_q^r \setminus \{k t_{p_1} : k \in \mathbb{F}_q\}$.
Also, packets $p_3$ and $p_4$ are secrecy-dependent.

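Consider a decoding set $\mathcal{D}$ containing one or more secret packets.
Choosing an appropriate packet ordering, we can express $[T_{\mathcal{D}} \mid V_{\mathcal{D}}]$ in
the form

$$[T_{\mathcal{D}} \mid V_{\mathcal{D}}] = \left[\begin{array}{c|c} A + B_1 & V_1 \\ CA + B_2 & V_2 \\ B_3 & V_3 \end{array}\right] \qquad (2)$$

where for any given values of $B_i \in \mathbb{F}_q^{s_i \times r}$, $V_i \in \mathbb{F}_q^{s_i \times (d+c)}$, $i = 1, 2, 3$,
and $C \in \mathbb{F}_q^{s_2 \times s_1}$, the matrix $A \in \mathbb{F}_q^{s_1 \times r}$ has a conditional
distribution that is uniform over all values for which $T_{\mathcal{D}}$ is nonsingular.
The first $s_1 + s_2$ rows correspond to secret packets, and the first $s_1$ rows
correspond to a set of secrecy-independent packets. $s_2 = 0$ if there are
no secrecy-dependencies among the secret packets in $\mathcal{D}$.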

This notion of secret packets provides the most general characterization
of the conditions under which the scheme succeeds. For a given
network topology, a requirement on the number of secrecy-independent
secret packets received at the sink can be translated into constraints on
the subsets of links/packets the adversary can observe and/or modify.
For instance, if information is sent on $n$ parallel paths from a source to
a sink node, then the number of secrecy-independent secret packets is
the number of linearly independent packets received on paths that are
not observed or controlled by the adversary.

Note that we allow each packet of the decoding set to be corrupted
with an independent adversarial packet, as long as at least one of the
packets has been formed as a linear combination with some secret
packet.

III. Main Results

In the following theorem, we consider decoding from a set of
packets that contains some nongenuine packet, which causes the
decoded packets to differ from the original exogenous packets. The
first part of the theorem gives a lower bound on the number of equally
likely potential values of the decoded packets; the adversary cannot
narrow down the set of possible outcomes beyond this regardless of
how it designs its adversarial packets. The second part provides, for a
simple polynomial hash function, an upper bound on the proportion
of potential decoding outcomes that can have consistent data and hash
values, in terms of $k = \lceil d/c \rceil$, the ceiling of the ratio of the number
of data symbols to hash symbols. Larger values of $k$ correspond
to lower overhead but also to a lower probability of detecting an adversarial
modification. This tradeoff is a design parameter for the network.

Theorem 1: Consider a decoding set $\mathcal{D}$ containing a secrecy-independent
subset of $s_1$ secret (possibly nongenuine) packets, and suppose
the decoding set contains at least one nongenuine packet.

a) The adversary cannot determine which of a set of at least $(q - 1)^{s_1}$
equally likely values of the decoded packets will be obtained at the
sink. In particular, there will be at least $s_1$ packets such that, for each
of these, the adversary cannot determine which of a set of at least $q - 1$
equally likely values will be obtained.

b) Let $\psi : \mathbb{F}_q^k \to \mathbb{F}_q$ be the function mapping $(x_1, \ldots, x_k)$, $x_i \in \mathbb{F}_q$, to

$$\psi(x_1, \ldots, x_k) = x_1^2 + \cdots + x_k^{k+1} \qquad (3)$$

where $k = \lceil d/c \rceil$. Suppose the function $\psi_d$ mapping the data symbols
$x_1, \ldots, x_d$ to the hash symbols $y_1, \ldots, y_c$ in an exogenous packet is
defined by

$$y_i = \psi(x_{(i-1)k+1}, \ldots, x_{ik}), \quad \forall\, i = 1, \ldots, c - 1$$

$$y_c = \psi(x_{(c-1)k+1}, \ldots, x_d).$$

Then the probability of not detecting an error is at most $\left(\frac{k+1}{q}\right)^{s_1}$.


Corollary 1: Let the hash function $\psi_d$ be defined as in Theorem 1b.
Suppose a sink obtains more than $r$ packets, including a secrecy-independent
set of $s$ secret packets, and at least one nongenuine packet. If
the sink decodes using two or more decoding sets whose union includes
all its received packets, then the probability of not detecting an error is
at most $\left(\frac{k+1}{q}\right)^{s}$.


Example: With 2% overhead ($k = 50$), symbol length = 7 bits,
$s = 5$, the detection probability is at least 98.9%; with 1% overhead
($k = 100$), symbol length = 8 bits, $s = 5$, the detection probability
is at least 99.0%.
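
The hash construction and the numbers in the example can be checked with a short sketch. The Python below is illustrative only: the function names are hypothetical, the hash is evaluated with modular arithmetic and therefore assumes a prime field size $q$, and the miss-probability bound is taken to be $((k+1)/q)^s$, the form consistent with Lemma 1 and with the figures quoted in the example.

```python
from math import ceil


def psi(xs, q):
    """Polynomial hash x_1^2 + x_2^3 + ... + x_k^(k+1), computed mod a prime q."""
    return sum(pow(x, i + 2, q) for i, x in enumerate(xs)) % q


def hash_symbols(data, c, q):
    """Map d data symbols to c hash symbols.

    Consecutive groups of k = ceil(d / c) data symbols are hashed with psi;
    the last group may be shorter, as in the definition of y_c.
    """
    k = ceil(len(data) / c)
    return [psi(data[i * k:(i + 1) * k], q) for i in range(c)]


def undetected_bound(k, q, s):
    """Assumed upper bound ((k + 1) / q) ** s on the probability of a miss."""
    return ((k + 1) / q) ** s


if __name__ == "__main__":
    print(hash_symbols([1, 2, 3, 4], c=2, q=7))
    # Parameters of the example: symbol length 7 bits (q = 2**7) and 8 bits (q = 2**8).
    print(1 - undetected_bound(k=50, q=2**7, s=5))
    print(1 - undetected_bound(k=100, q=2**8, s=5))
```

Only the bound calculation applies directly to the example, since field sizes $2^7$ and $2^8$ are not prime and would need extension-field arithmetic to evaluate the hash itself.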


IV. Development, Proofs, and Ancillary Results
A.  Vulnerable  Scenarios 

Before analyzing the scenario described in the previous sections, we
first point out when this approach fails to detect adversarial
modifications.

First, the sink needs some way of knowing if the source stops
transmitting; otherwise, the assumption of no shared secret information
results in the adversary being indistinguishable from the source. One
possibility is that the source either transmits at a known rate or is inactive,
and that the sink knows at what rates it should be receiving information
on various subsets of incoming links when the source is active. If the
adversary is unable to reproduce those information rates, e.g., because
it does not control the same part of the network as the source, then the
sink knows when the source is inactive.

Second, if the adversary knows that the genuine packets received at
a sink have coefficient vectors that lie in some $w$-dimensional subspace
$W \subset \mathbb{F}_q^r$, the following strategy allows it to control the decoding
outcome and so ensure that the decoded packets have consistent data and
hash values.

The adversary ensures that the sink receives $w$ genuine packets with
linearly independent coefficient vectors in $W$, by supplying additional
such packets if necessary. The adversary also supplies the sink with
$r - w$ nongenuine packets whose coefficient vectors $t_1, \ldots, t_{r-w}$ are
not in $W$. Let $t_{r-w+1}, \ldots, t_r$ be a set of basis vectors for $W$, and let
$T$ be the matrix whose $i$th row is $t_i$. Then the coefficient vectors of the
$r$ packets can be represented by the rows of the matrix

$$\begin{bmatrix} I & 0 \\ 0 & K \end{bmatrix} T$$

where $K$ is a nonsingular matrix in $\mathbb{F}_q^{w \times w}$. From (1), we have

$$\begin{bmatrix} I & 0 \\ 0 & K \end{bmatrix} T \tilde{M} = \begin{bmatrix} \tilde{V} \\ 0 \end{bmatrix}$$

$$\tilde{M} = T^{-1} \begin{bmatrix} I & 0 \\ 0 & K^{-1} \end{bmatrix} \begin{bmatrix} \tilde{V} \\ 0 \end{bmatrix} = T^{-1} \begin{bmatrix} \tilde{V} \\ 0 \end{bmatrix}.$$

Since the adversary knows $T$ and controls $\tilde{V}$, it can determine $\tilde{M}$.


B.  Byzantine  Modification  Detection 

We next consider the scenario described in Section II, where the
adversary designs its packets without knowing the contents of one or more
secret packets the receiver will use for decoding, and prove the results
of Section III.

We first establish two results that are used in the proof of
Theorem 1. Consider the hash function defined in (3). We call a vector
$(x_1, \ldots, x_{k+1}) \in \mathbb{F}_q^{k+1}$ consistent if $x_{k+1} = \psi(x_1, \ldots, x_k)$.

Lemma 1: At most $k + 1$ out of the $q$ vectors in a set

$$\{u + \gamma v : \gamma \in \mathbb{F}_q\}$$

where $u = (u_1, \ldots, u_{k+1})$ is a fixed vector in $\mathbb{F}_q^{k+1}$ and
$v = (v_1, \ldots, v_{k+1})$ is a fixed nonzero vector in $\mathbb{F}_q^{k+1}$, can be consistent.

Proof: Suppose some vector $u + \gamma v$ is consistent, i.e.,

$$u_{k+1} + \gamma v_{k+1} = (u_1 + \gamma v_1)^2 + \cdots + (u_k + \gamma v_k)^{k+1}. \qquad (4)$$

Note that for any fixed value of $u$ and any fixed nonzero value of $v$, (4)
is a polynomial equation in $\gamma$ of degree equal to $1 + \tilde{k}$, where $\tilde{k} \in [1, k]$
is the highest index for which the corresponding $v_{k'}$ is nonzero, i.e.,
$v_{\tilde{k}} \neq 0$, $v_{k'} = 0 \;\forall\, k' > \tilde{k}$. (If $v_1 = \cdots = v_k = 0$, then
$v_{k+1} \neq 0$ and (4) is linear in $\gamma$, with exactly one root.) Since a nonzero
polynomial of degree $n$ over a field has at most $n$ roots,
this equation can have at most $1 + \tilde{k} \leq 1 + k$ roots. Thus, the property
can be satisfied for at most $1 + k$ values of $\gamma$. □
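
Lemma 1 can be sanity-checked by brute force on a tiny field. The sketch below is an illustration rather than part of the proof: it assumes a prime $q$ (so that arithmetic modulo $q$ is field arithmetic), uses hypothetical function names, and enumerates every line $\{u + \gamma v\}$ in $\mathbb{F}_q^{k+1}$ to find the largest number of consistent vectors on any one line.

```python
from itertools import product


def psi(xs, q):
    """Polynomial hash x_1^2 + ... + x_k^(k+1), computed mod a prime q."""
    return sum(pow(x, i + 2, q) for i, x in enumerate(xs)) % q


def is_consistent(vec, q):
    """A vector (x_1, ..., x_{k+1}) is consistent if x_{k+1} = psi(x_1, ..., x_k)."""
    *xs, last = vec
    return last == psi(xs, q)


def max_consistent_on_line(q, k):
    """Largest number of consistent vectors on any line {u + g*v : g in F_q}, v != 0."""
    worst = 0
    for u in product(range(q), repeat=k + 1):
        for v in product(range(q), repeat=k + 1):
            if not any(v):
                continue  # the lemma requires a nonzero direction vector
            count = sum(
                is_consistent([(a + g * b) % q for a, b in zip(u, v)], q)
                for g in range(q)
            )
            worst = max(worst, count)
    return worst


if __name__ == "__main__":
    q, k = 5, 2
    print(max_consistent_on_line(q, k), "<=", k + 1)
```

The exhaustive search grows as $q^{2(k+1)+1}$, so it is only practical for very small $q$ and $k$.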

Corollary 2: Let $u$ be a fixed row vector in $\mathbb{F}_q^{k+1}$ and $Y$ a fixed nonzero
matrix in $\mathbb{F}_q^{n \times (k+1)}$. If row vector $g$ is distributed uniformly over $\mathbb{F}_q^n$,
then the vector $u + gY$ is consistent with probability at most $\frac{k+1}{q}$.

Proof: Suppose the $i$th row of $Y$, denoted $y_i$, is nonzero. We can
partition the set of possible values for $g$ such that each partition consists
of all vectors that differ only in the $i$th entry $g_i$. For each partition, the
corresponding set of values of $u + gY$ is of the form $\{u' + g_i y_i : g_i \in \mathbb{F}_q\}$.
The result follows from Lemma 1 and the fact that $g_i$ is uniformly
distributed over $\mathbb{F}_q$. □


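Proof of Theorem 1: We condition on any given values of $B_i$, $V_i$,
$i = 1, 2, 3$, and $C$ in (2). Writing $A' = A + B_1$, $T_{\mathcal{D}}$ becomes

$$\begin{bmatrix} A' \\ C(A' - B_1) + B_2 \\ B_3 \end{bmatrix}.$$

From (1), we have

$$\begin{bmatrix} A' \\ C(A' - B_1) + B_2 \\ B_3 \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}$$

$$\begin{bmatrix} A' \\ -CB_1 + B_2 \\ B_3 \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 \\ V_2 - CV_1 \\ V_3 \end{bmatrix}$$

which we can simplify to

$$\begin{bmatrix} A' \\ B' \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 \\ V_2' \end{bmatrix} \qquad (5)$$

by writing

$$B' = \begin{bmatrix} -CB_1 + B_2 \\ B_3 \end{bmatrix}, \qquad V_2' = \begin{bmatrix} V_2 - CV_1 \\ V_3 \end{bmatrix}.$$

Since the determinant of a matrix is not changed by adding a multiple of
one row to another row, and $\begin{bmatrix} A' \\ B' \end{bmatrix}$ is obtained from $T_{\mathcal{D}}$ by a sequence
of such operations, we have

$$\begin{bmatrix} A' \\ B' \end{bmatrix} \text{ is nonsingular} \iff T_{\mathcal{D}} \text{ is nonsingular.}$$

Thus, matrix $A' \in \mathbb{F}_q^{s_1 \times r}$ has a conditional distribution that is uniform
over the set $\mathcal{A}$ of values for which $\begin{bmatrix} A' \\ B' \end{bmatrix}$ is nonsingular.

The condition that the decoding set contains at least one nongenuine
packet corresponds to the condition $V_{\mathcal{D}} \neq 0$. We consider two cases.
In each case, we show that we can partition the set $\mathcal{A}$ such that at most
a fraction $\left(\frac{k+1}{q}\right)^{s_1}$ of values in each partition give decoding outcomes
$M + \tilde{M}$ with consistent data and hash values. The result then follows
since the conditional distribution of values within each partition is
uniform.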

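Case 1: $V_2' \neq 0$. Let $v_i$ be some nonzero row of $V_2'$, and $b_i$ the
corresponding row of $B'$. Then $b_i \tilde{M} = v_i$.

We first partition $\mathcal{A}$ into cosets

$$\mathcal{A}_n = \{A_n + \mathbf{r}^T b_i : \mathbf{r} \in \mathbb{F}_q^{s_1}\}, \quad n = 1, 2, \ldots, \chi$$

where

$$\chi = \frac{|\mathcal{A}|}{q^{s_1}}.$$

This can be done by the following procedure. Any element of $\mathcal{A}$ can
be chosen as $A_1$. Matrices $A_2, A_3, \ldots, A_\chi$ are chosen sequentially;
for each $m = 2, \ldots, \chi$, $A_m$ is chosen to be any element of $\mathcal{A}$ not in
the cosets $\mathcal{A}_n$, $n < m$. Note that this forms a partition of $\mathcal{A}$, since the
presence of some element $c$ in two sets $\mathcal{A}_n$ and $\mathcal{A}_m$, $n < m$, implies
that $A_m$ is also in $\mathcal{A}_n$, which is a contradiction. It is also clear that each
coset has size $|\{\mathbf{r} : \mathbf{r} \in \mathbb{F}_q^{s_1}\}| = q^{s_1}$.

For each such coset $\mathcal{A}_n$, the corresponding values of $\tilde{M}$ satisfy, from
(5),

$$\begin{bmatrix} A_n + \mathbf{r}^T b_i \\ B' \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 \\ V_2' \end{bmatrix}$$

$$\begin{bmatrix} A_n \\ B' \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 - \mathbf{r}^T v_i \\ V_2' \end{bmatrix}$$

$$\tilde{M} = \begin{bmatrix} A_n \\ B' \end{bmatrix}^{-1} \begin{bmatrix} V_1 - \mathbf{r}^T v_i \\ V_2' \end{bmatrix}$$

where $\mathbf{r} \in \mathbb{F}_q^{s_1}$. Let $U$ be the submatrix consisting of the first $s_1$
columns of $\begin{bmatrix} A_n \\ B' \end{bmatrix}^{-1}$. Since $U$ is full rank, we can find a set
$J \subset \{1, \ldots, r\}$ of $s_1$ indices that correspond to linearly independent rows
of $U$. Let $[U_1 \mid U_2]$ be the $s_1 \times r$ submatrix of $\begin{bmatrix} A_n \\ B' \end{bmatrix}^{-1}$ consisting of
rows with indices in $J$. Consider the corresponding rows of $M + \tilde{M}$,
which can be expressed in the form

$$M_J + U_1 V_1 - U_1 \mathbf{r}^T v_i + U_2 V_2' \qquad (6)$$

where $M_J$ is the submatrix of $M$ consisting of rows corresponding
to set $J$. Since $U_1$ is nonsingular by the choice of $J$, $U_1 \mathbf{r}^T$ takes
potentially any value in $\mathbb{F}_q^{s_1}$. Thus, the set of potential values for each
row of (6), for any given value of $M_J$, $A_n$, $B'$, $V_1$, $V_2'$, $v_i$ and the
other rows, is of the form $\{u + \gamma v_i : \gamma \in \mathbb{F}_q\}$ where $u$ is a function
of $M_J$, $A_n$, $B'$, $V_1$, $V_2'$. Applying Lemma 1 yields the result for this
case.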

Case 2: $V_2' = 0$, i.e., $V_2 - CV_1 = V_3 = 0$. Then $V_1 \neq 0$, since
otherwise $V_1 = V_2 = 0$ and $V_{\mathcal{D}} = 0$, which would contradict the
assumption that there is at least one nongenuine packet.

We partition $\mathcal{A}$ such that each partition consists of all matrices in $\mathcal{A}$
that have the same row space:

$$\mathcal{A}_n = \{R A_n : R \in \mathbb{F}_q^{s_1 \times s_1},\ \det(R) \neq 0\}, \quad n = 1, 2, \ldots, \chi$$

where

$$|\mathcal{A}_n| = \prod_{i=0}^{s_1 - 1} (q^{s_1} - q^i), \qquad \chi = \frac{|\mathcal{A}|}{|\mathcal{A}_n|}.$$

This can be done by choosing any element of $\mathcal{A}$ as $A_1$, and choosing
$A_n$, $n = 2, \ldots, \chi$ sequentially such that $A_n$ is any element of $\mathcal{A}$ not
in $\mathcal{A}_m$, $m < n$.

For each $\mathcal{A}_n$, $n = 1, \ldots, \chi$, the corresponding values of $\tilde{M}$ satisfy,
from (5),

$$\begin{bmatrix} R A_n \\ B' \end{bmatrix} \tilde{M} = \begin{bmatrix} V_1 \\ 0 \end{bmatrix}, \qquad \begin{bmatrix} A_n \\ B' \end{bmatrix} \tilde{M} = \begin{bmatrix} R^{-1} V_1 \\ 0 \end{bmatrix}, \qquad \tilde{M} = \begin{bmatrix} A_n \\ B' \end{bmatrix}^{-1} \begin{bmatrix} R^{-1} V_1 \\ 0 \end{bmatrix}.$$

Let $U$ be the submatrix consisting of the first $s_1$ columns of $\begin{bmatrix} A_n \\ B' \end{bmatrix}^{-1}$.
We can find an ordered set $J = \{i_1, \ldots, i_{s_1} : i_1 < \cdots < i_{s_1}\} \subset \{1, \ldots, r\}$
of $s_1$ indices that correspond to linearly independent rows
of $U$. Let $U_J$ and $M_J$ be the submatrices of $U$ and $M$, respectively,
consisting of the $s_1$ rows corresponding to $J$. Then $U_J$ is nonsingular,
and the value of the matrix representation of the corresponding decoded
packets is uniformly distributed over the set

$$\{M_J + R' V_1 : R' \in \mathbb{F}_q^{s_1 \times s_1},\ \det(R') \neq 0\}. \qquad (7)$$

Let $\nu$ be the rank of $V_1$. Consider a set of $\nu$ linearly independent
rows of $V_1$. Denote by $I$ the corresponding set of row indices, and
denote by $V_I$ the submatrix of $V_1$ consisting of those rows. We can
write

$$V_1 = L V_I$$

where $L \in \mathbb{F}_q^{s_1 \times \nu}$ has full rank $\nu$. We define $R_I = R' L$, noting that

$$R_I V_I = R' L V_I = R' V_1$$

and that $R_I$ is uniformly distributed over all matrices in $\mathbb{F}_q^{s_1 \times \nu}$ that
have full rank $\nu$. Thus, (7) becomes

$$\{M_J + R_I V_I : R_I \in \mathbb{F}_q^{s_1 \times \nu},\ \operatorname{rank}(R_I) = \nu\}. \qquad (8)$$

Denote by $\mathbf{r}_1, \ldots, \mathbf{r}_{s_1}$ the rows of $R_I$, and by $R_n$ the submatrix of
$R_I$ consisting of its first $n$ rows. Denote by $m_{i_n}$ the $i_n$th row of $M$.
We consider the rows sequentially, starting with the first row $\mathbf{r}_1$.
For $n = 1, \ldots, s_1$, we will show that
conditioned on any given value of $R_{n-1}$, the probability that the $i_n$th
decoded packet $m_{i_n} + \mathbf{r}_n V_I$ is consistent is at most $\frac{k+1}{q}$.

Case A: $R_{n-1}$ has zero rank. This is the case if $n = 1$, or if $n > 1$
and $R_{n-1} = 0$.

Suppose we remove the restriction $\operatorname{rank}(R_I) = \nu$, so that $\mathbf{r}_n$ is
uniformly distributed over $\mathbb{F}_q^{\nu}$. By Corollary 2, $m_{i_n} + \mathbf{r}_n V_I$ would
have consistent data and hash values with probability at most $\frac{k+1}{q}$. With
the restriction $\operatorname{rank}(R_I) = \nu$, the probability of $\mathbf{r}_n$ being equal to 0
is lowered. Since the corresponding decoded packet $m_{i_n} + \mathbf{r}_n V_I$ is
consistent for $\mathbf{r}_n = 0$, the probability that it is consistent is less than
$\frac{k+1}{q}$.

Case B: $n > 1$ and $R_{n-1}$ has nonzero rank.

Conditioned on $\mathbf{r}_n$ being in the row space of $R_{n-1}$, $\mathbf{r}_n = \mathbf{g} R_{n-1}$
where $\mathbf{g}$ is uniformly distributed over $\mathbb{F}_q^{n-1}$. Since $V_I$ has linearly
independent rows, $R_{n-1} V_I \neq 0$, and by Corollary 2, the corresponding
decoded packet

$$m_{i_n} + \mathbf{r}_n V_I = m_{i_n} + \mathbf{g} R_{n-1} V_I$$

is consistent with probability at most $\frac{k+1}{q}$.

Conditioned on $\mathbf{r}_n$ not being in the row space of $R_{n-1}$, we can
partition the set of possible values for $\mathbf{r}_n$ into cosets

$$\{\mathbf{r} + \mathbf{g} R_{n-1} : \mathbf{g} \in \mathbb{F}_q^{n-1}\}$$

where $\mathbf{r}$ is not in the row space of $R_{n-1}$; the corresponding values of
the $i_n$th decoded packet are given by

$$\{m_{i_n} + \mathbf{r} V_I + \mathbf{g} R_{n-1} V_I : \mathbf{g} \in \mathbb{F}_q^{n-1}\}.$$

Noting as before that $R_{n-1} V_I \neq 0$ and applying Corollary 2, the $i_n$th
decoded packet is consistent with probability at most $\frac{k+1}{q}$. □

Proof of Corollary 1: Suppose two or more different sets of
packets are used for decoding. If not all of them contain at least
one nongenuine packet, the decoded values obtained from different
decoding sets will differ: sets containing only genuine packets will be
decoded to $M$, while sets containing one or more nongenuine packets
will not. This will indicate an error.

Otherwise, suppose all the decoding sets contain at least one
nongenuine packet. Let $S$ denote the set of $s$ secrecy-independent packets.
Consider the decoding sets in turn, denoting by $s_i'$ the number of
unmodified packets from $S$ in the $i$th decoding set that are not in any set
$j < i$. Conditioned on any fixed values of packets in sets $j < i$, there
remain $s_i'$ secrecy-independent packets in the $i$th decoding set, and we
have from Theorem 1 that at most a fraction $\left(\frac{k+1}{q}\right)^{s_i'}$ of decoding
outcomes for set $i$ have consistent data and hash values. Thus, the overall
fraction of consistent decoding outcomes is at most

$$\left(\frac{k+1}{q}\right)^{\sum_i s_i'} = \left(\frac{k+1}{q}\right)^{s}. \qquad \square$$

V. Conclusion

We have described an information-theoretic approach for detecting
Byzantine modifications in networks employing random linear network
coding. Byzantine modification detection capability is added by
augmenting each packet with a small, flexible number of hash symbols;
this overhead can be traded off against the detection probability and
symbol length. The hash symbols can be obtained as a simple polynomial
function of the data symbols. The only necessary condition is
that the adversarial packets are not all designed with knowledge of the
random coding coefficients of all other packets received by the sink
nodes. This approach can be used in conjunction with higher overhead
schemes that are activated only upon detection of a Byzantine node.
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Abstract — The  continuous  variable  quantum  key  distribution  has  been 
considered to have the potential to provide high secret key rate. However,
in present experimental demonstrations, the secret key can be distilled only
under very small loss rates. Here, by calculating explicitly the computational
complexity with the channel transmission, we show that under high
loss rate it is hard to distill the secret key in the present continuous variable
scheme, and one of its advantages, the potential of providing high secret
key rate, may therefore be limited.

Index Terms - Computational complexity, continuous variable (CV),
error correction, quantum key distribution (QKD), reconciliation.


I.  INTRODUCTION 

Due to its potential for achieving high modulation and detection
speed, continuous variable (CV) quantum key distribution (QKD) has
recently attracted more and more attention. Compared to single photon
counting schemes, CVQKD does not require single photon sources
and detectors, which are technically challenging at present. The CVQKD
schemes typically use the quadrature amplitude of light beams as the
information carrier, and homodyne detection rather than photon counting.
Some of these schemes use nonclassical states, such as squeezed states
[1] or entangled states [2], while others use coherent states [3]-[6].
Because the squeezed states and entangled states are sensitive to losses in
the quantum channel, coherent states are much more attractive for long
distance transmission. To improve the performance of CVQKD
against channel loss, Grosshans et al. proposed a reverse reconciliation
(RR) protocol [11]. In the traditional direct reconciliation protocol,
Alice sends Bob the quantum state and also sends the reconciliation
information later.* Finally, Bob obtains Alice's data without any error.
However, in the reverse reconciliation protocol, the quantum state is
sent by Alice to Bob, but the reconciliation information is sent by Bob
to Alice. Finally, Alice gets Bob's received data with no error.

Tabletop experimental setups that encode information in the phase
and amplitude of coherent states have been demonstrated [7], [8], and
recent experiments have shown the feasibility of CVQKD in optical
fibers up to a distance of 55 km [9], [10], but without obtaining the
final secret keys.

Unlike the single photon QKD schemes, many CVQKD schemes utilize
the inertial quantum noise to protect information from Eve's attack
[7], [12]. However, at the same time the quantum noise also causes errors
between two legitimate communicators, Alice and Bob. It is widely

* In the following, we use the conventional appellation: Alice is the quantum
state sender and Bob is the quantum state receiver.
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Abstract  The  use  of  network  coding  in  military  networks 
opens many interesting issues for security. The mixing of
data inherent to network coding may at first appear to pose
challenges, but it also enables new security approaches.
In this paper, we overview the current theoretical
understanding and application areas for network-coding
based security in the areas of robustness to Byzantine
attackers and of distributed signature schemes for downloads.

I. Introduction

The Global Information Grid (GIG) is the infrastructure
used to conduct Net-Centric Operations (NCO). The GIG
is intended to be a single information-sharing network with
multiple levels of security and bandwidth capabilities in a
net-centric environment. A net-centric information environment
is inclusive of Core and Communities of Interest (COI)
enterprise services, a data sharing strategy, and the Task-Post-
Process-Use (TPPU) paradigm. The Global Information Grid
Bandwidth Expansion (GIG-BE) Program was a major DoD
net-centric transformational initiative executed by the Defense
Information Systems Agency (DISA).

The ultimate purpose of the GIG-BE project is to provide a
secure and reliable platform to enable worldwide Net-Centric
Operations for intelligence, surveillance and reconnaissance,
and command and control, supporting massive amounts of
information sharing by providing a "bandwidth-available" environment.
Through GIG-BE, DISA leveraged DoD's existing end-to-end
information transport capabilities, significantly expanding
capacity and reliability to select Joint Staff-approved locations
worldwide, and used new hardware and software contracts to
build a communications infrastructure. The GIG-BE, which is
intended to provide high-capacity communications linking DoD
users at locations worldwide, is a ground-based optical network
with up to 10-Gbps connections and averaging 105 Gbps per
link on the backbone networks. The GIG-BE program has
greatly contributed to the development of real-time Net-Centric
Operations. However, a bottleneck link problem exists
between core networks and edge networks due to the enormous
difference in bandwidths.

The DoD supports NCO and GIG-BE projects to improve
quality of service in the net-centric environment. The current
coding systems will not be appropriate in the near future.
However, coding-based scalable communication technology
has not yet been applied to Net-Centric Operations. This
technology will satisfy the bandwidth requirements of tomorrow's
warfighters.


Network coding is a recent development in which nodes
in the network are allowed to perform algebraic operations
inside the network. This scheme was first introduced in [1]
and a powerful algebraic framework, which allows further
developments, was provided in [2], [3]. For multicast settings,
it was shown in [4], [5] that network coding performed in a
distributed, random fashion is with high probability optimal.
A tutorial on network coding can be found in [6], [7].

The specifics of the Scalable Information Operations (SIO)
include: 1. scalable coding techniques for network coding,
compression, channel coding, multimedia data transmission,
encryption, data sharing, data anonymization, meta database
management, caching, network security, and intrusion detection;
2. bottleneck flow control. The purpose of this paper
is to overview some of the recent developments in applying
network coding to security in the areas of detection and
correction of Byzantine attacks, and of cryptography for
network coding based file downloads. The aim of this paper
is mainly tutorial and further technical details can be found in
[8], [9]. In particular, our goal is to sketch how network-coding
based scalable information operations will mitigate some of
the security issues in the future net-centric environment.

II. Network-Coding Based Detection and Correction of Byzantine Attackers

The mixture of data that occurs in network coding can lead
to pollution attacks through rogue, or Byzantine, nodes in the
network [10], [11]. Such nodes may be unreliable through
failure or because of their being compromised. While the use
of network coding would at first appear to render the problem
of Byzantine attackers worse, it actually provides some strong
protection for both the detection and correction of such nodes.
The results in this section have previously appeared in more
detailed form in [8]. We consider network error correction in a
distributed packet network setting with random linear network
coding using coding vectors. A batch of $r$ packets is multicast
from a source node $s$ to a set of sink nodes. An omniscient
adversary can arbitrarily corrupt the coding vector as well as
the data symbols of up to $z_0$ packets. A packet that is not a
linear combination of its input packets is called adversarial.
We describe below a polynomial-complexity network
error-correcting code whose parameters depend on $z_0$, the maximum
number of adversarial packets, and $m$, the minimum source-sink
cut capacity (maximum error-free multicast rate) in units
of packets over the batch. The number of packets in the batch
is set as $r = m - z_0$. The proportion of redundant symbols in
each packet, denoted $\rho$, is set as $\rho = (z_0 + \epsilon)/r$ for some $\epsilon > 0$.
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The  corresponding  information  rate  of  the  code  approaches 
m  —  2z0  asymptotically  as  the  packet  size  increases.  If  instead 
of  an  omniscient  adversary  we  assume  that  the  source  and 
sinks  share  a  secret  channel  not  observed  by  the  adversary,  a 
higher  rate  of  m  —  z0  is  asymptotically  achievable.  Below  we 
give the details of the code for the omniscient adversary case.
For $i = 1, \ldots, r$, the $i$th source packet is represented as a
length-$n$ row vector $x_i$ with entries in a finite field $\mathbb{F}_q$. The first
$n - \rho n - r$ entries of the vector are independent exogenous data
symbols, the next $\rho n$ are redundant symbols, and the last $r$
symbols form the packet's coding vector (the unit vector with
a single nonzero entry in the $i$th position). We denote by $X$ the
$r \times n$ matrix whose $i$th row is $x_i$; it can be written in the block
form $[\,U \;\; R \;\; I\,]$ where $U$ denotes the $r \times (n - \rho n - r)$
matrix of exogenous data symbols, $R$ denotes the $r \times \rho n$ matrix
of redundant symbols and $I$ is the $r \times r$ identity matrix.

The $r \rho n$ redundant symbols are obtained as follows. For
any matrix $M$, let $v_M^T$ denote the column vector obtained
by stacking the columns of $M$ one above the other, and $v_M$
its transpose, a row vector. Matrix $X$, represented in column
vector form, is given by $v_X^T = [v_U, v_R, v_I]^T$. Let $D$ be
an $r \rho n \times rn$ matrix obtained by choosing each entry
independently and uniformly at random from $\mathbb{F}_q$. The redundant
symbols constituting $v_R$ (or $R$) are obtained by solving the
matrix equation

$$D [v_U, v_R, v_I]^T = 0 \qquad (1)$$

for $v_R$. The value of $D$ is known to all parties.

An adversarial packet can be viewed as an additional source
packet. The vector representing the $i$th adversarial packet is
denoted $z_i$. Let $Z$ denote the matrix whose $i$th row is $z_i$.

We focus on any one of the sink nodes $t$. Let $w$ be
the number of linearly independent packets received by $t$;
let $Y \in \mathbb{F}_q^{w \times n}$ denote the matrix whose $i$th row is the
vector representing the $i$th of these packets. Since all coding
operations in the network are scalar linear operations in $\mathbb{F}_q$,
$Y$ can be expressed as

$$Y = GX + KZ \qquad (2)$$

where matrices $G \in \mathbb{F}_q^{w \times r}$ and $K \in \mathbb{F}_q^{w \times z}$ represent the linear
mappings from the source and adversarial packets respectively
to the sink's set of linearly independent input packets.

Since the last $r$ columns of $X$ form an identity matrix, the
matrix $G'$ formed by the last $r$ columns of $Y$ is given by

$$G' = G + KL, \qquad (3)$$

where $L$ is the matrix formed by the last $r$ columns of $Z$. The
sink knows $G'$ but not $G$. Thus, we rewrite (2) as

$$Y = G'X + K(Z - LX) = G'X + E. \qquad (4)$$

Matrix $E$ gives the difference between the data values in the
received packets and the data values corresponding to their
coding vectors; its last $r$ columns are all zero.
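
The algebra behind (2)-(4) can be checked numerically. The sketch below is an illustration under stated assumptions, not the code construction itself: it works over an assumed prime field of size 257, fills the non-coding-vector part of each source packet with random symbols (ignoring the redundancy constraint (1)), and uses hypothetical helper names. It builds $Y = GX + KZ$, reads $G'$ and $L$ off the last $r$ columns, and forms $E = K(Z - LX)$.

```python
import random

P = 257  # assumed prime field size, so arithmetic mod P is field arithmetic


def matmul(A, B):
    """Matrix product over F_P."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def matadd(A, B, sign=1):
    """Entrywise A + sign * B over F_P."""
    return [[(a + sign * b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def rand_matrix(rows, cols, rng):
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]


def last_cols(A, r):
    """Submatrix formed by the last r columns."""
    return [row[-r:] for row in A]


def simulate(r=3, z=2, w=5, n=8, seed=0):
    rng = random.Random(seed)
    # Source packets: random payload followed by the r x r identity (coding vectors).
    X = [rand_matrix(1, n - r, rng)[0] + [int(i == j) for j in range(r)]
         for i in range(r)]
    Z = rand_matrix(z, n, rng)   # adversarial packets, arbitrary content
    G = rand_matrix(w, r, rng)   # transfer matrix from source packets to sink
    K = rand_matrix(w, z, rng)   # transfer matrix from adversarial packets to sink
    Y = matadd(matmul(G, X), matmul(K, Z))           # equation (2)
    L = last_cols(Z, r)
    G_prime = last_cols(Y, r)                        # what the sink observes
    E = matmul(K, matadd(Z, matmul(L, X), sign=-1))  # E = K(Z - LX)
    return Y, G, K, L, G_prime, E, X


if __name__ == "__main__":
    Y, G, K, L, G_prime, E, X = simulate()
    print(G_prime == matadd(G, matmul(K, L)))        # equation (3)
    print(Y == matadd(matmul(G_prime, X), E))        # equation (4)
    print(all(v == 0 for row in last_cols(E, 3) for v in row))
```

The three checks correspond to (3), (4), and the statement that the last $r$ columns of $E$ vanish; they hold for any choice of $G$, $K$, and $Z$ because the last $r$ columns of $X$ are the identity.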


Lemma 1: With probability at least $1 - \eta/q$, the matrix $G'$
has full column rank, where $\eta$ is the number of links in the
network.

The decoding process at sink $t$ is as follows. First, the sink
determines $z$, the minimum cut from the adversarial packets
to the sink. This is with high probability equal to $w - r$. Next,
it chooses $z$ columns of $Y$ that, together with the columns
of $G'$, form a basis for the column space of $Y$. We assume
without loss of generality that the first $z$ columns are chosen,
and we denote the corresponding submatrix $G''$. Rewriting $Y$
in the basis corresponding to the matrix $[G''\ G']$, we have

$$Y = [G''\ G'] \begin{bmatrix} I_z & Y^Z & 0 \\ 0 & Y^X & I_r \end{bmatrix} \tag{5}$$


This  can  be  reduced  by  linear  algebraic  manipulations  to 


$$G'X_2 = G'(Y^X + X_1 Y^Z) \tag{6}$$

where $X_1$, $X_2$ are the matrices formed by the first $z$ columns
of $X$ and the next $n - z - r$ columns of $X$ respectively.

Proposition 1: With probability greater than 1 — qn, equations
(1) and (6) can be solved simultaneously to recover $X$.
The decoding algorithm has complexity $O(n^3 m^3)$.


III. Cryptography for Content Distribution with Network Coding

A.  Background 

Recently, several researchers explored the use of network
coding in peer-to-peer (P2P) content distribution and distributed
storage systems [12], [13], [14]. A P2P network has a
fully distributed architecture, and the peers in the network
form a cooperative network that shares the resources, such
as storage, CPU, and bandwidth, of all the computers in the
network. This architecture offers a cost-effective and scalable
way to distribute software updates, videos, and other large files
to a large number of users.

The best example of a P2P cooperative architecture is the
BitTorrent system [15], which splits large files into small
blocks; after a node downloads a block from the original
server or from another peer, it becomes a server for that
particular block. Although BitTorrent has become extremely
popular for distribution of large files over the Internet, it
may suffer from a number of inefficiencies which decrease its
overall performance. For example, scheduling is a key problem
in BitTorrent; it is difficult to efficiently select which block(s)
to download first and from where. If a rare block is only
found on peers with slow connections, this creates a
bottleneck for all the downloaders. Several ad hoc strategies
are used in BitTorrent to ensure that different blocks are
equally spread in the system as the system evolves. References
[12], [13] propose the use of network coding to increase
the efficiency of content distribution in a P2P cooperative
architecture. The main idea of this approach is the following.
The server breaks the file to be distributed into small blocks,
and whenever a peer requests a file, the server sends a random
linear combination of all the blocks. As in BitTorrent, a peer
acts as a server to the blocks it has obtained. However, in a
linear coding scheme, any output from a peer node is also
a random linear combination of all the blocks it has already
received. A peer node can reconstruct the whole file when
it has received enough degrees of freedom to decode all the
blocks. This scheme is completely distributed, and eliminates
the need for a scheduler, as any block transmitted contains
partial information of all the blocks that the sender possesses.
It has been shown both mathematically [12] and through live
trials [16] that the random linear coding scheme significantly
reduces the downloading time and improves the robustness of
the system.
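The random linear coding idea described above can be sketched in a few lines. This is an illustrative toy, assuming a prime field of size 257 and hypothetical helper names `encode` and `decode`; it is not the implementation used in [12] or [16]. A peer keeps collecting coded packets until Gaussian elimination over the field yields all $m$ blocks, that is, until it has $m$ degrees of freedom.

```python
import random

P = 257  # prime field size, an illustrative choice

def encode(blocks, rng):
    """Return (coefficients, payload) for one random linear combination of blocks."""
    coeffs = [rng.randrange(P) for _ in blocks]
    payload = [sum(c * b[j] for c, b in zip(coeffs, blocks)) % P
               for j in range(len(blocks[0]))]
    return coeffs, payload

def decode(packets, m):
    """Gauss-Jordan elimination over F_P; returns the m blocks, or None if rank < m."""
    rows = [list(c) + list(p) for c, p in packets]
    for col in range(m):
        pivot = next((i for i in range(col, len(rows)) if rows[i][col]), None)
        if pivot is None:
            return None  # not enough degrees of freedom yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)          # modular inverse (Python 3.8+)
        rows[col] = [v * inv % P for v in rows[col]]
        for i in range(len(rows)):
            if i != col and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % P for a, b in zip(rows[i], rows[col])]
    return [row[m:] for row in rows[:m]]

rng = random.Random(7)
blocks = [[10, 20, 30, 40], [1, 2, 3, 4], [200, 100, 50, 25]]
packets = []
while decode(packets, len(blocks)) is None:
    packets.append(encode(blocks, rng))   # no scheduling: any coded packet helps
print(decode(packets, len(blocks)) == blocks)
```

No packet is requested by index; the receiver only needs enough linearly independent combinations, which is why the scheduler becomes unnecessary.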

A major concern for any network coding system is the
protection against malicious nodes. Take the above content
distribution system for example. If a node in the P2P network
behaves maliciously, it can create a polluted block with
valid coding coefficients and then send it out. Here, coding
coefficients refer to the random linear coefficients used to
generate this block. If there is no mechanism for a peer to
check the integrity of a received block, a receiver of this
polluted block would not be able to decode anything for the
file at all, even if all the other blocks it has received are valid.
To make things worse, the receiver would mix this polluted
block with other blocks and send them out to other peers, and
the pollution can quickly propagate to the whole network. This
makes coding-based content distribution even more vulnerable
than traditional P2P networks, and several attempts were
made to address this problem. References [12], [17] proposed
to use homomorphic hash functions in content distribution
systems to detect polluted packets, and [18] suggested the use
of a Secure Random Checksum (SRC), which requires less
computation than the homomorphic hash function. However,
[18] requires a secure channel to transmit the SRCs to all
the nodes in the network. Charles et al. [19] proposed a
signature scheme based on the Weil pairing on elliptic curves that
provides authentication of the data in addition to pollution
detection, but the computational complexity of this solution is
quite high. Moreover, the security offered by elliptic curves
that admit the Weil pairing is still a topic of debate in the scientific
community.

In this section, we overview a new signature scheme,
presented in greater detail in [9], that is not based on elliptic
curves, and is designed specifically for random linear coded
systems. We view all blocks of the file as vectors, and make
use of the fact that all valid vectors transmitted in the network
should belong to the subspace spanned by the original set of
vectors from the file. We present a signature that can be used
to easily check the membership of a received vector in the
given subspace, while at the same time it is hard for a node to
generate a vector that is not in that subspace but passes the
signature test. We show that this signature scheme is secure,
and that the overhead for the scheme is negligible for large
files.

B.  Problem  Setup 

We model the network by a directed graph $\mathcal{G}_d = (\mathcal{N}, \mathcal{A})$,
where $\mathcal{N}$ is the set of nodes, and $\mathcal{A}$ is the set of communication
links. A source node $s \in \mathcal{N}$ wishes to send a large file, of size
$M$, to a set of client nodes, $\mathcal{T} \subset \mathcal{N}$, and we refer to all the
clients as peers. The large file is divided into $m$ blocks, and
any peer receives different blocks from the source node or
from other peers. In this framework, a peer is also a server to
blocks it has downloaded, and always sends out random linear
combinations of all the blocks it has obtained so far to other
peers. When a peer has received enough degrees of freedom
to decode the data, i.e., it has received $m$ linearly independent
blocks, it can reconstruct the whole file.

Specifically, we view the $m$ blocks of the file, $\bar{v}_1, \ldots, \bar{v}_m$,
as elements in the $n$-dimensional vector space $\mathbb{F}_p^n$, where $p$ is
a prime. The source node augments these vectors to create
vectors $v_1, \ldots, v_m$, given by

$$v_i = (0, \ldots, 1, \ldots, 0, \bar{v}_{i1}, \ldots, \bar{v}_{in}),$$

where the first $m$ elements are zero except that the $i$th one is
1, and $\bar{v}_{ij} \in \mathbb{F}_p$ is the $j$th element in $\bar{v}_i$. Packets received by
the peers are linear combinations of the augmented vectors,

$$w = \sum_{i=1}^{m} \beta_i v_i,$$

where $\beta_i$ is the weight of $v_i$ in $w$. We see that the additional
$m$ elements in the front of the augmented vector keep track
of the code vector, $\beta$, of the corresponding packet.
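A small numerical sketch of the augmentation may help. It assumes the toy prime $p = 11$ and two hypothetical helpers, `augment` and `combine`; the block contents are arbitrary. The first $m$ entries of the combined packet reproduce the weights $\beta_i$, which is exactly the bookkeeping role described above.

```python
P = 11  # small prime p for illustration

def augment(blocks):
    """Prefix block i with the i-th unit vector of length m."""
    m = len(blocks)
    return [[int(i == j) for j in range(m)] + list(b) for i, b in enumerate(blocks)]

def combine(vectors, beta):
    """Return w = sum_i beta[i] * vectors[i] over F_P."""
    return [sum(b * v[k] for b, v in zip(beta, vectors)) % P
            for k in range(len(vectors[0]))]

blocks = [[3, 1, 4, 1], [5, 9, 2, 6]]   # m = 2 blocks, n = 4 symbols each
v = augment(blocks)                     # [[1, 0, 3, 1, 4, 1], [0, 1, 5, 9, 2, 6]]
beta = [7, 2]
w = combine(v, beta)
print(w)  # [7, 2, 9, 3, 10, 8]; the first m entries equal beta
```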

As mentioned in the previous subsection, this kind of
network coding scheme is vulnerable to pollution attacks by
malicious nodes. Unlike uncoded systems, where the source
knows all the blocks being transmitted in the network and
can therefore sign each one of them, in a coded system each
peer produces "new" packets, and standard digital signature
schemes do not apply. In the next subsection, we introduce
a novel signature scheme for the coded system.

C. Signature scheme for network coding

We note that the vectors $v_1, \ldots, v_m$ span a subspace $V$ of
$\mathbb{F}_p^{m+n}$, and a received vector $w$ is a valid linear combination of
vectors $v_1, \ldots, v_m$ if and only if it belongs to the subspace $V$.
This is the key observation for our signature scheme. In the
scheme described below, we present a system that is based
upon standard modulo arithmetic (in particular the hardness
of the Discrete Logarithm problem) and upon an invariant
signature $\sigma(V)$ for the linear span $V$. Each node verifies the
integrity of a received vector $w$ by checking the membership
of $w$ in $V$ based on the signature $\sigma(V)$.

Our signature scheme is defined by the following ingredients,
which are independent of the file(s) to be distributed.

• $q$: a large prime number such that $p$ is a divisor of $q - 1$.
Note that standard techniques, such as that used in the Digital
Signature Algorithm (DSA), apply to find such $q$.

• $g$: a generator of the group $G$ of order $p$ in $\mathbb{F}_q$. Since the
order of the multiplicative group $\mathbb{F}_q^*$ is $q - 1$, which is a
multiple of $p$, we can always find a subgroup, $G$, with
order $p$ in $\mathbb{F}_q^*$.

• Private key: $K_{pr} = \{\alpha_i\}_{i=1,\ldots,m+n}$, a random set of
elements in $\mathbb{F}_p^*$. $K_{pr}$ is only known to the source.

• Public key: $K_{pu} = \{h_i = g^{\alpha_i}\}_{i=1,\ldots,m+n}$. $K_{pu}$ is
signed by some standard signature scheme, e.g., DSA,
and published by the source.

To distribute a file in a secure manner, the signature scheme
works as follows.

1) Using the vectors $v_1, \ldots, v_m$ from the file, the source
finds a vector $u = (u_1, \ldots, u_{m+n}) \in \mathbb{F}_p^{m+n}$ orthogonal
to all vectors in $V$. Specifically, the source finds a non-zero
solution, $u$, to the equations

$$v_i \cdot u = 0, \quad i = 1, \ldots, m.$$

2) The source computes the vector $x = (u_1/\alpha_1, u_2/\alpha_2, \ldots, u_{m+n}/\alpha_{m+n})$.

3) The source signs $x$ with some standard signature scheme
and publishes $x$. We refer to the vector $x$ as the
signature, $\sigma(V)$, of the file being distributed.

4) The client node verifies that $x$ is signed by the source.

5) When a node receives a vector $w$ and wants to verify
that $w$ is in $V$, it computes

$$d = \prod_{i=1}^{m+n} h_i^{x_i w_i}$$

and verifies that $d = 1$.

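To see that $d$ is equal to 1 for any valid $w$, we have

$$d = \prod_{i=1}^{m+n} h_i^{x_i w_i} = \prod_{i=1}^{m+n} g^{u_i w_i} = g^{\sum_{i=1}^{m+n} u_i w_i} = 1,$$

where the last equality comes from the fact that $u$ is orthogonal
to all vectors in $V$.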
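Steps 1) to 5) can be traced end to end with toy parameters. The sketch below is illustrative only: it assumes $p = 11$, $q = 23$ and $g = 4$ (an element of order 11 in $\mathbb{F}_{23}^*$), fixed rather than random private key values, and hypothetical function names (`keygen`, `orthogonal_vector`, `sign`, `verify`). It omits the outer standard signature on $x$ and $K_{pu}$. Parameters this small offer no security; they only make the arithmetic easy to follow.

```python
P, Q = 11, 23   # toy primes with P | Q - 1 (the text quotes 160-bit p, 1024-bit q)
G = 4           # element of order P in F_Q^*, since 4**11 % 23 == 1

def keygen(alphas):
    """Public key h_i = g^alpha_i mod Q for private alphas in F_P^*."""
    return [pow(G, a, Q) for a in alphas]

def orthogonal_vector(blocks, tail):
    """u with v_i . u = 0 for augmented v_i = (e_i, block_i); tail fixes u_{m+1..m+n}."""
    head = [(-sum(b * t for b, t in zip(block, tail))) % P for block in blocks]
    return head + list(tail)

def sign(u, alphas):
    """Signature x_i = u_i / alpha_i in F_P."""
    return [ui * pow(a, -1, P) % P for ui, a in zip(u, alphas)]

def verify(w, x, h):
    """Check d = prod_i h_i^(x_i w_i) == 1 in F_Q (exponents reduced mod P)."""
    d = 1
    for hi, xi, wi in zip(h, x, w):
        d = d * pow(hi, xi * wi % P, Q) % Q
    return d == 1

blocks = [[3, 1, 4, 1], [5, 9, 2, 6]]      # m = 2, n = 4
alphas = [2, 3, 5, 7, 1, 4]                # private key (fixed here for repeatability)
h = keygen(alphas)                         # public key
u = orthogonal_vector(blocks, tail=[1, 2, 3, 4])
x = sign(u, alphas)                        # signature sigma(V)

valid = [7, 2, 9, 3, 10, 8]                # 7*v_1 + 2*v_2 mod 11
polluted = valid[:-1] + [(valid[-1] + 1) % P]
print(verify(valid, x, h), verify(polluted, x, h))  # True False
```

Because the first $m$ coordinates of the augmented vectors form an identity block, choosing the last $n$ entries of $u$ freely and solving for the first $m$ is one simple way to satisfy step 1); the source does not prescribe how $u$ is found.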

Next, we show that the system described above is secure. In
essence, the theorem below shows that given a set of vectors
that satisfy the signature verification criterion, it is provably
as hard as the Discrete Logarithm problem to find new vectors
that also satisfy the verification criterion other than those that
are in the linear span of the vectors already known.

Definition 1. Let $p$ be a prime number and $G$ be a multiplicative
cyclic group of order $p$. Let $k$ and $n$ be two integers
such that $k < n$, and $\Gamma = \{h_1, \ldots, h_n\}$ be a set of generators
of $G$. Given a linear subspace, $V$, of rank $k$ in $\mathbb{F}_p^n$ such that
for every $v \in V$, the equality $\Gamma^v \triangleq \prod_{i=1}^{n} h_i^{v_i} = 1$ holds, we
define the $(p, k, n)$-Diffie-Hellman problem as the problem of
finding a vector $w \in \mathbb{F}_p^n$ with $\Gamma^w = 1$ but $w \notin V$.

By this definition, the problem of finding an invalid vector
that satisfies our signature verification criterion is a
$(p, m, m+n)$-Diffie-Hellman problem.

Theorem 1. For any $k < n - 1$, the $(p, k, n)$-Diffie-Hellman
problem is as hard as the Discrete Logarithm problem.

Proof: Assume that we have an efficient algorithm to
solve the $(p, k, n)$-Diffie-Hellman problem, and we wish to
compute the discrete logarithm $\log_g(z)$ for some $z = g^x$,
where $g$ is a generator of a cyclic group $G$ with order $p$.
We can choose two random vectors $r = (r_1, \ldots, r_n)$ and
$s = (s_1, \ldots, s_n)$ in $\mathbb{F}_p^n$, and construct $\Gamma = \{h_1, \ldots, h_n\}$, where
$h_i = z^{r_i} g^{s_i}$ for $i = 1, \ldots, n$. We then find $k$ linearly independent
(and otherwise random) solution vectors $v_1, \ldots, v_k$ to the
equations

$$v \cdot r = 0 \quad \text{and} \quad v \cdot s = 0.$$

Note that there exist $n - 2$ linearly independent solutions to the
above equations. Let $V$ be the linear span of $\{v_1, \ldots, v_k\}$; it is
clear that any vector $v \in V$ satisfies $\Gamma^v = 1$. Now, if we have
an algorithm for the $(p, k, n)$-Diffie-Hellman problem, we can
find a vector $w \notin V$ such that $\Gamma^w = 1$. This vector would
satisfy $w \cdot (xr + s) = 0$. Since $r$ is statistically independent
from $(xr + s)$, with probability greater than $1 - 1/p$, we have
$w \cdot r \neq 0$. In this case, we can compute

$$\log_g(z) = x = -\frac{w \cdot s}{w \cdot r}.$$

This means the ability to solve the $(p, k, n)$-Diffie-Hellman
problem implies the ability to solve the Discrete Logarithm
problem. ■
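The final step of the reduction can be illustrated with the same toy group ($p = 11$, $q = 23$, $g = 4$, all assumptions made for readability). Since no efficient solver for the $(p, k, n)$-Diffie-Hellman problem is available, the hypothetical `brute_force_oracle` below stands in for it by exhaustive search, and it directly enforces $w \cdot r \neq 0$ (which implies $w \notin V$) instead of being handed $V$. The point of the sketch is only the arithmetic $x = -(w \cdot s)/(w \cdot r)$.

```python
from itertools import product

P, Q, G = 11, 23, 4   # toy parameters: G has order P in F_Q^*

def gamma_pow(h, w):
    """Gamma^w = prod_i h_i^(w_i) in F_Q."""
    d = 1
    for hi, wi in zip(h, w):
        d = d * pow(hi, wi, Q) % Q
    return d

def dot(a, b):
    return sum(x * y for x, y in zip(a, b)) % P

def brute_force_oracle(h, r):
    """Stand-in for the assumed solver: some w with Gamma^w = 1 and w . r != 0."""
    for w in product(range(P), repeat=len(h)):
        if gamma_pow(h, w) == 1 and dot(w, r) != 0:
            return list(w)
    return None

def discrete_log_via_oracle(z, r, s):
    """Recover x from z = g^x using h_i = z^(r_i) * g^(s_i) and the oracle's w."""
    h = [pow(z, ri, Q) * pow(G, si, Q) % Q for ri, si in zip(r, s)]
    w = brute_force_oracle(h, r)
    return (-dot(w, s) * pow(dot(w, r), -1, P)) % P

secret_x = 7
z = pow(G, secret_x, Q)
recovered = discrete_log_via_oracle(z, r=[1, 2, 3], s=[4, 5, 6])
print(recovered == secret_x)
```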

This proof is an adaptation of a proof that appeared in an
earlier publication by Boneh et al. [20].

D.  Discussion 

Our signature scheme makes use of the linearity
property of random linear network coding, and enables the
peers to check the integrity of packets without the requirement
for a secure channel. Also, the computation involved in the
signature generation and verification processes is very simple.

Next, we examine the overhead incurred by this signature
scheme. The size of each file block is $B = n \log(p)$ and we
have $M = mn \log(p)$. The size of each augmented vector
(with coding vectors in the front) is $B_a = (m + n) \log(p)$,
and thus, the overhead of the coding vector is $m/n$ times
the file size. Note that this is the overhead pertaining to the
linear coding scheme, not to our signature scheme, and any
practical network coding system would make $m \ll n$. The
initial setup of our signature scheme involves the publishing
of the public key, $K_{pu}$, which has size $(m + n) \log(q)$. In
typical cryptographic applications, the size of $p$ is 20 bytes
(160 bits), and the size of $q$ is 128 bytes (1024 bits); thus, the
size of $K_{pu}$ is approximately equal to $6(m + n)/mn$ times
the file size.

For distribution of each file, the incremental overhead of
our scheme consists of two parts: the public data, $K_{pu}$, and
the signature vector, $x$.

For the public key, $K_{pu}$, we note that it cannot be fully
reused for multiple files, as it is possible for a malicious node
to generate an invalid vector that satisfies the check $d = 1$
using information obtained from previously downloaded files.
To prevent this from happening, we can publish a new public
key for each file, and as mentioned above, the overhead is
about $6(m + n)/mn$ times the file size, which is small as
long as $6 \ll m < n$.

Alternatively, for every new file, we can randomly pick an
integer $i$ between 1 and $m + n$, select a new random value
for $\alpha_i$ in the private key, and just publish the new $h_i = g^{\alpha_i}$.
The overhead for this method is only $6/mn$ times the file
size. As an example, if we have a file of size 10MB, divided
into $m = 100$ blocks, the value of $n$ would be in the order of
thousands, and thus, this overhead is less than 0.01% of the file
size. This method should provide good security except in the
case where we expect the vector $w$ to have low variability,
for example, when it has many zeros. Security can be increased by
changing more elements in the private key for each new file.

In addition, for each new file distributed, we also have to
publish a new signature $x$, which is computed from a vector
$u$ that is orthogonal to the subspace $V$ spanned by the file.
Since $V$ has dimension $m$, it is sufficient to only replace $m$
elements in $u$ to generate a vector orthogonal to the new file.
Since the first $m$ elements in the vectors $v_1, \ldots, v_m$ are always
linearly independent (they are the code vectors), it suffices to
just modify the entries $u_1$ to $u_m$. Assume that the $i$th element
in the private key is the only one that has been changed for
the distribution of the new file, and that $i$ is between 1 and $m$;
then we only need to publish $x_1$ to $x_m$ for the new signature
vector. This part of the overhead has size $m \log(p)$, and the
ratio between this overhead and the original file size $M$ is $1/n$.
Again, taking a 10MB file as an example, this overhead is less than
0.1% of the file size.
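The overhead figures quoted in this subsection can be reproduced with a short calculation. The sketch assumes that 10MB means $10^7$ bytes and uses the sizes stated in the text (20 bytes for $p$, 128 bytes for $q$); the function name `overheads` and the dictionary keys are illustrative.

```python
def overheads(file_bytes, m, p_bytes=20, q_bytes=128):
    """Overheads as fractions of the file size, using the sizes quoted in the text."""
    n = file_bytes // (m * p_bytes)   # symbols per block, from M = m * n * log(p)
    return {
        "n": n,
        "coding_vector": m / n,                               # m / n
        "full_public_key": (m + n) * q_bytes / file_bytes,    # (m + n) log(q) / M
        "single_key_element": q_bytes / file_bytes,           # log(q) / M, about 6 / (m n)
        "signature_update": m * p_bytes / file_bytes,         # m log(p) / M = 1 / n
    }

result = overheads(10 * 10**6, m=100)
for name, value in result.items():
    print(name, value)
```

With these inputs $n = 5000$, republishing a single $h_i$ costs about 0.0013% of the file size, and the $m$ updated signature entries cost 0.02%, consistent with the bounds of 0.01% and 0.1% stated above.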

Therefore, after the initial setup, each additional file distributed
only incurs a negligible amount of overhead using
our signature scheme.

Finally, we would like to point out that, under our assumptions
that there is no secure side channel from the source to
all the peers and that the public key is available to all the
peers, our signature scheme has to be used on the original
file vectors, not on hash functions. This is because, to maintain
the security of the system, we would need a one-way hash
function that is homomorphic; however, we are not aware of
any such hash function. Although [12] and [17] suggested the
use of homomorphic hash functions for network coding,
[12] assumed that the intermediate nodes do not know the
parameters used for generating the hash function, and [17]
assumed that a secure channel is available to transmit the hash
values of all the blocks from the source node to the peers.
Under our more relaxed assumptions, these hash functions
would not work.

IV. Conclusions

In this paper, we have overviewed some of the security capabilities
of network coding, particularly in the areas of robustness
to Byzantine attacks and distributed authentication in peer-to-peer
downloads. The implications of network coding for
security are not limited to these applications. For instance,
the mixing of data in network coding can itself serve as an
effective countermeasure to eavesdropping. In effect, data
is used, after compression, as a one-time pad in the system
[21], [22], [23], [24]. None of these techniques, or the ones
summarized in this paper, presents in itself a complete
security solution, and we have not attempted to implement
any of our security techniques. However, as network coding
opens entirely new avenues for the operation of networks, we
expect to see security challenges inherited from traditional
forms of networking, the mitigation of current problems, but
also the emergence of new classes of data sharing problems in
Net-Centric environments. We will further develop scalable and
secure network coding techniques to solve multimedia delivery
and massive data sharing problems in Airborne/UAV networks.
Abstract

Network coding substantially increases network throughput.
But since it involves mixing of information inside the network,
a single corrupted packet generated by a malicious node can
end up contaminating all the information reaching a destination,
preventing decoding.

This paper introduces the first distributed polynomial-time rate-optimal
network codes that work in the presence of Byzantine
nodes. We present algorithms that target adversaries with different
attacking capabilities. When the adversary can eavesdrop on all
links and jam $z_O$ links, our first algorithm achieves a rate of
$C - 2z_O$, where $C$ is the network capacity. In contrast, when the
adversary has limited snooping capabilities, we provide algorithms
that achieve the higher rate of $C - z_O$.

Our algorithms attain the optimal rate given the strength of
the adversary. They are information-theoretically secure. They
operate in a distributed manner, assume no knowledge of the
topology, and can be designed and implemented in polynomial
time. Furthermore, only the source and destination need to be
modified; non-malicious nodes inside the network are oblivious to
the presence of adversaries and implement a classical distributed
network code. Finally, our algorithms work over wired and wireless
networks.

I.  Introduction 

Network coding allows the routers to mix the information
content in packets before forwarding them. This mixing has
been theoretically proven to maximize network throughput [1],
[19], [13]. It can be done in a distributed manner with low complexity,
and is robust to packet losses and network failures [8],
[23]. Furthermore, recent implementations of network coding
for wired and wireless environments demonstrate its practical
benefits [16], [6].

But what if the network contains malicious nodes? A malicious
node may pretend to forward packets from source to
destination, while in reality it injects corrupted packets into
the information flow. Since network coding makes the routers
mix packets' content, a single corrupted packet can end up
corrupting all the information reaching a destination. Unless this
problem is solved, network coding may perform much worse
than pure forwarding in the presence of adversaries.

The interplay of network coding and Byzantine adversaries
has been examined by a few recent papers. Some detect the
presence of an adversary [10], others correct the errors he injects
into the codes under specific conditions [7], [12], [20], and a
few bound the maximum achievable rate in such adverse environments
[3], [29]. But attaining optimal rates using distributed
and low-complexity codes is still an open problem.

This paper designs distributed polynomial-time rate-optimal
network codes that combat Byzantine adversaries. We present
three algorithms that target adversaries with different strengths.
The adversary can inject $z_O$ packets per unit time, but his
listening power varies. When the adversary is omniscient, i.e., he
observes transmissions on the entire network, our codes achieve
the rate of $C - 2z_O$, with high probability. When the adversary's
knowledge is limited, either because he eavesdrops only on a
subset of the links or because the source and destination share a low-rate
secret channel, our algorithms deliver the higher rate of $C - z_O$.

The intuition underlying all of our algorithms is that the
aggregate packets from the adversarial nodes can be thought
of as a second source. The information received at the destination
is a linear transform of the source's and the adversary's
information. Given enough linear combinations (enough coded
packets), the destination can decode both sources. The question,
however, is how the destination distills the source's
information out of the received mixture. To do so, the source's
information has to satisfy certain constraints that the attacker's
data cannot satisfy. This can be done by judiciously adding
redundancy at the source. For example, the source may add
redundancy to ensure that certain functions evaluate to zero
on the original source's data, and thus can be used to distill
the source's data from the adversary's. The challenge addressed
herein is to design the redundancy that achieves the optimal
rates.

This paper makes several contributions. The algorithms
presented herein are the first distributed algorithms with
polynomial-time complexity in design and implementation that
are also rate-optimal. In fact, since pure forwarding is a special
case of network coding, being rate-optimal, our algorithms also
achieve a higher rate than any approach that does not use
network coding. They assume no knowledge of the topology
and work in both wired and wireless networks. Furthermore,
implementing our algorithms involves only a slight modification
of the source and destination, while the internal nodes can
continue to use standard network coding.

II. Illustrating Example

We illustrate the intuition underlying our approach using
the toy example in Fig. 1. Calvin wants to prevent the flow
of information from Alice to Bob, or at least minimize it. All
links have a capacity of one packet per unit time. Further, Calvin
connects to the three routers through an intermediate node. The
intermediate node just relays all the packets Calvin sends him
to the three routers. The network capacity, $C$, is by definition
the min-cut from Alice to Bob. It is equal to 3 packets per unit
time. The min-cut from Calvin to the destination is $z_O = 1$
packet per unit time. Hence, the maximum rate from Alice to
Bob in this scenario is bounded by $C - z_O = 2$ packets per unit
time, as proven in [12].
Fig. 1. A simple example. Alice transmits to Bob. Calvin injects
corrupted packets into their communication. The grey nodes in
the middle perform network coding.

We express each packet as a vector of $n$ bytes, where $n$ is
a sufficiently large number. The routers create random linear
combinations of the packets they receive. Hence, every unit of
time Bob receives the packets:

$$y_i = \alpha_i x_i + \beta_i z, \quad i \in \{1, 2, 3\}, \tag{1}$$

where the $x_i$'s are vectors representing the three packets Alice sent,
$z$ is the packet Calvin sent, and $\alpha_i$, $\beta_i$ are random coefficients.

In our example, the routers operate over bytes; the $i$th byte
in an outgoing packet is a linear combination of the $i$th bytes in the
incoming packets. Thus, (1) also describes the relation between
the individual bytes in the $y_i$'s and the corresponding bytes in the $x_i$'s
and $z$.

Since the routers mix the content of the packets, Alice cannot
just sign her packets and have Bob discard all packets with
incorrect signatures. To decode, Bob has to somehow distill the
$x_i$'s from the $y_i$'s he receives.

As a first attempt at solving the problem, let us assume that
Bob knows the topology, i.e., he knows that the packets he
receives are produced using (1). Further, let us assume that he
knows the random coefficients used by the routers to code the
packets, i.e., he knows the values of the $\alpha_i$'s and $\beta_i$'s. To decode,
Bob has to solve (1). Since each packet contains $n$ bytes, the
system in (1) represents $3n$ equations, one equation per received
byte. Bob has $3n$ equations and $4n$ unknowns ($n$ unknown bytes
for each of the packets $z$, $x_1$, $x_2$ and $x_3$). Hence, Bob cannot decode.

To address the above situation, Alice needs to add redundancy
to her transmitted packets. After all, as noted above, for
the particular example in Fig. 1, Alice's rate is bounded by 2
packets per unit time. Thus, Alice should send no more than 2
packets' worth of information. She can use the third packet for
added redundancy. Suppose Alice sets

$$x_3 = x_1 + x_2. \tag{2}$$

This coding strategy is public to both Bob and Calvin. Since
each packet contains $n$ bytes, combining (2) with (1), Bob
obtains a system of $4n$ equations with $4n$ unknowns, which
he can solve to decode.
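The known-coefficient case can be worked through in code. This is a minimal sketch under stated assumptions: arithmetic is done in the prime field of size 257 as a stand-in for byte arithmetic, the coefficient and data values are arbitrary, and `solve_mod` and `bob_decode` are hypothetical helper names. For each byte position, substituting (2) into (1) leaves three equations in the three unknowns $x_{1,j}$, $x_{2,j}$ and $z_j$.

```python
P = 257  # prime field standing in for byte arithmetic (assumption)

def solve_mod(A, b):
    """Solve A t = b over F_P by Gauss-Jordan elimination (A assumed invertible)."""
    k = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(k):
        piv = next(i for i in range(c, k) if M[i][c])
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, P)
        M[c] = [v * inv % P for v in M[c]]
        for i in range(k):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * v) % P for a, v in zip(M[i], M[c])]
    return [row[-1] for row in M]

def bob_decode(y, alpha, beta):
    """Recover x1, x2 bytewise from y_i = alpha_i x_i + beta_i z with x3 = x1 + x2."""
    A = [[alpha[0], 0, beta[0]],
         [0, alpha[1], beta[1]],
         [alpha[2], alpha[2], beta[2]]]
    cols = [solve_mod(A, [y[0][j], y[1][j], y[2][j]]) for j in range(len(y[0]))]
    return [c[0] for c in cols], [c[1] for c in cols]

x1, x2, z = [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]
x3 = [(a + b) % P for a, b in zip(x1, x2)]           # redundancy from equation (2)
alpha, beta = [2, 3, 5], [7, 11, 13]                 # assumed known to Bob here
y = [[(a * xi + b * zi) % P for xi, zi in zip(x, z)]
     for a, b, x in zip(alpha, beta, [x1, x2, x3])]  # equation (1)
decoded = bob_decode(y, alpha, beta)
print(decoded == (x1, x2))
```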

But in the general case, Bob knows nothing about the
coefficients used by the routers, the topology, or the overall
network transform. Said differently, the 6 coefficients corresponding
to the $\alpha_i$'s and the $\beta_i$'s are usually unknown to Bob.
Thus, given (1) and (2), Bob is faced with $4n$ equations and
$4n + 6$ unknowns, and thus cannot decode. The matter is further
complicated by the non-linearity of (1), which involves the
products of unknown terms $\alpha_i x_i$ and $\beta_i z$.

The first idea we exploit in our solution is that while $z$ is
a whole unknown packet of $n$ bytes, each of the coefficients
$\alpha_i$ and $\beta_i$ is a single byte. Thus, instead of devoting a whole vector
of $n$ bytes for added redundancy (as in (2)), Alice just needs
to introduce 6 extra bytes of redundancy to compensate for the
$\alpha_i$'s and $\beta_i$'s being unknown.

Alice imposes constraints on her data to help Bob to decode.
For instance, a simple constraint could be that the first byte in
each packet equals zero. This constraint provides Bob with 2
additional equations (recall that the first byte in $x_3$ is forced to
0 due to (2), and hence the new constraint produces 2 additional
equations rather than 3). Rewriting (1) for the first byte of each
packet, we obtain:

$$y_{i,1} = \alpha_i x_{i,1} + \beta_i z_1 = \beta_i z_1, \quad i \in \{1, 2, 3\}, \tag{3}$$

where $y_{i,j}$ denotes the $j$th byte in the $i$th received packet. The
above equations provide Bob with a scaled version of the $\beta_i$'s,
i.e., they are all multiplied by $z_1$.

Our second observation is that the scaled version of the $\beta_i$'s
suffices for Bob to decode the $x_i$'s. This can be seen by a simple
algebraic manipulation of (1). Bob can rewrite the equations
in (1) by multiplying and dividing the second term with $z_1$ and
appending (2) to obtain

$$y_i = \alpha_i x_i + (\beta_i z_1)(z / z_1), \quad i \in \{1, 2, 3\}. \tag{4}$$

Notice that Bob already knows all three $\beta_i z_1$ terms from (3).
The term $(z / z_1)$ can be considered a single unknown because
Bob does not care about estimating the exact value of $z$.

To allow Bob to discover the $\alpha_i$'s, Alice similarly adds 4
more bytes of redundancy by imposing constraints on the second
and third bytes in her packets. For example, she chooses $x_{1,2} =
x_{2,2} = 1$ and $x_{1,3} = -x_{2,3} = 1$ (combined with (2), these
constraints force $x_{3,2} = 2$ and $x_{3,3} = 0$). Substituting the values
of $(\beta_1 z_1)$, $(\beta_2 z_1)$ and $(\beta_3 z_1)$ from (3) gives Bob the following
equations.

$$\begin{aligned}
y_{1,2} &= \alpha_1 + y_{1,1}(z_2/z_1), & y_{1,3} &= \alpha_1 + y_{1,1}(z_3/z_1) \\
y_{2,2} &= \alpha_2 + y_{2,1}(z_2/z_1), & y_{2,3} &= -\alpha_2 + y_{2,1}(z_3/z_1) \\
y_{3,2} &= 2\alpha_3 + y_{3,1}(z_2/z_1), & y_{3,3} &= y_{3,1}(z_3/z_1)
\end{aligned} \tag{5}$$

Now  Bob  has  6  linear  equations  with  the  5  unknowns  qi, 
<*2>tt3.  x2/xi  and  33/21,  and  they  can  be  solved  to  obtain  the 
a*'s.  Hence  we  are  essentially  back  to  the  situation  where  Bob 
knows  the  c*i's  and  ft’s,  and  can  solve  for  xj’s. 
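
The toy decoding above can be traced numerically. The following is a minimal sketch, not the paper's implementation: it assumes the prime field GF(257) in place of the byte field, 5-symbol packets, and hypothetical values for the coefficients and for Calvin's packet. The helper names (`inv`, `solve3`, `bob_decode`) are illustrative. Bob uses five of the six equations in (5) to find the unknowns and the sixth as a consistency check, then solves a 3x3 system per remaining byte.

```python
P = 257  # assumption: a small prime field stands in for the paper's byte field

def inv(a):
    """Multiplicative inverse in GF(P) (needs Python 3.8+)."""
    return pow(a % P, -1, P)

def solve3(A, rhs):
    """Solve the 3x3 system A w = rhs over GF(P) by Gauss-Jordan elimination."""
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(3):
        piv = next(r for r in range(c, 3) if M[r][c] % P)
        M[c], M[piv] = M[piv], M[c]
        s = inv(M[c][c])
        M[c] = [v * s % P for v in M[c]]
        for r in range(3):
            if r != c:
                f = M[r][c]
                M[r] = [(a - f * b) % P for a, b in zip(M[r], M[c])]
    return [M[r][3] for r in range(3)]

def bob_decode(y):
    """Recover the payload bytes of x1 and x2 from the three received packets."""
    v = y[2][2] * inv(y[2][0]) % P             # z3/z1, from y_{3,3}
    a1 = (y[0][2] - y[0][0] * v) % P           # alpha_1, from y_{1,3}
    a2 = (y[1][0] * v - y[1][2]) % P           # alpha_2, from y_{2,3}
    u = (y[0][1] - a1) * inv(y[0][0]) % P      # z2/z1, from y_{1,2}
    a3 = (y[2][1] - y[2][0] * u) * inv(2) % P  # alpha_3, from y_{3,2}
    assert y[1][1] == (a2 + y[1][0] * u) % P   # sixth equation, used as a check
    A = [[a1, 0, y[0][0]], [0, a2, y[1][0]], [a3, a3, y[2][0]]]
    x1, x2 = [], []
    for j in range(3, len(y[0])):              # unknowns: x_{1,j}, x_{2,j}, z_j/z1
        s1, s2, _ = solve3(A, [y[0][j], y[1][j], y[2][j]])
        x1.append(s1)
        x2.append(s2)
    return x1, x2

# Alice's packets obey the constraints in the text; x3 = x1 + x2 as in (2).
x1 = [0, 1, 1, 42, 200]
x2 = [0, 1, P - 1, 7, 99]
x3 = [(a + b) % P for a, b in zip(x1, x2)]
z = [9, 20, 30, 40, 50]                        # Calvin's packet (unknown to Bob)
alpha, beta = [3, 5, 7], [2, 4, 6]             # network coefficients (unknown to Bob)
y = [[(al * xi + be * zi) % P for xi, zi in zip(x, z)]
     for x, al, be in zip([x1, x2, x3], alpha, beta)]
decoded = bob_decode(y)
```

With these hypothetical values the 3x3 system is invertible, so `decoded` holds the two payload bytes of each of Alice's packets.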

One complication still remains. If Calvin knows the constraints
on Alice's data, he will try to assign values to his bytes
to prevent Bob from decoding. For example, if Calvin knows
that the first byte of each of Alice's packets is zero, he too
would set the first byte $z_1$ in his packet to zero, in which case
Bob does not obtain any information about the $\beta_i$'s from (4).

There  are  two  ways  out  of  this  situation.  Suppose  Alice 
could  communicate  to  Bob  a  small  message  that  is  secret  from 
Calvin.  In  this  case,  she  could  compute  a  small  number  of 
hashes  of  her  data,  and  transmit  them  to  Bob.  These  hashes 
correspond  to  constraints  on  her  data,  which  enables  Bob  to 
decode. If Alice cannot communicate secretly with Bob, she
leverages the fact that Calvin can inject only one fake packet.
Since Calvin's packet is n bytes long, he can cancel out at most
n hashes. If Alice injects n + 1 hashes, there must be at least
one hash Calvin cannot cancel. This hash enables Bob to find
the $\beta_i$'s and decode. Notice, however, that the n + 1 additional
constraints imposed on the bytes in $x_1$ and $x_2$ mean that Alice
can only transmit at most n - 1 bytes of data to Bob. For a
large number of bytes n in a packet, this rate is asymptotically
optimal against an all-knowing adversary [3].

After  giving  some  intuition  on  how  our  scheme  works,  the 
rest  of  this  paper  considers  the  general  problem  of  network 
coding  over  completely  unknown  topology,  in  the  presence  of 
an  adversary  who  has  partial  or  full  knowledge  of  the  network 
and  transmissions  in  it. 

III. Related Work

We  start  with  a  brief  summary  of  network  coding,  followed 
by  a  survey  of  prior  work  on  Byzantine  adversaries  in  networks. 

A.  Network  Coding  Background 

Work on network coding started with a pioneering paper
by Ahlswede et al. [1], which establishes the value of coding
in the routers and provides theoretical bounds on the capacity
of such networks. The combination of [21], [19], [13] shows
that, for multicast traffic, linear codes achieve the maximum
capacity bounds, and coding and decoding can be done in
polynomial time. Additionally, Ho et al. show that the above
is true even when the routers pick random coefficients [8].
Researchers have extended the above results to a variety of
areas including wireless networks [23], [15], [16], energy [28],
secrecy [2], content distribution [6], and distributed storage [14].

B.  Byzantine  Adversaries  in  Networks 

A Byzantine attacker is a malicious adversary hidden in
a network, capable of eavesdropping and jamming communications.
Prior research has examined these attacks in the
presence of network coding and without it. In the absence
of network coding, Dolev et al. [5] consider the problem
of communicating over a known graph containing Byzantine
adversaries. They show that for k adversarial nodes, reliable
communication is possible only if the graph has more than
2k + 1 vertex connectivity. Subramaniam extends this result
to unknown graphs [26]. Pelc et al. address the same problem
in wireless networks by modeling malicious nodes as locally
bounded Byzantine faults, i.e., nodes can overhear and jam
packets only in their neighborhood [24].

The interplay of network coding and Byzantine adversaries
was first examined in [10], which detects the existence of an
adversary but does not provide an error-correction scheme.
This has been followed by the work of Cai and Yeung [29],
[3], who generalize standard bounds on error-correcting codes
to networks, without providing any explicit algorithms for
achieving these bounds. Our work presents a constructive design
to achieve those bounds.

The problem of correcting errors in the presence of both
network coding and Byzantine adversaries has been considered
by a few prior proposals. Earlier work [20], [7] assumes a
centralized trusted authority that provides hashes of the original
packets to each node in the network. More recent work by
Charles et al. [4] obviates the need for a trusted entity under the
assumption that the majority of packets received by each node
is uncorrupted. In contrast to the above two schemes, which are
cryptographically secure, in a previous work [12] we consider
an information-theoretically rate-optimal solution to Byzantine
attacks for wired networks, which however requires a centralized
design. This paper builds on the above prior schemes to combine
their desirable traits; it provides a distributed solution that is
information-theoretically rate optimal and can be designed and
implemented in polynomial time. Furthermore, our algorithms
have new features: they assume no knowledge of the topology,
do not require any new functionality at internal nodes, and
work for both wired and wireless networks. Recent work [17]
has considered the same problem from a different perspective;
their results and bounds are similar to ours. Table I highlights
similarities and differences from prior work.

| Scheme | Charles et al. [4] | Jaggi et al. [12] | Ours |
|---|---|---|---|
| Info. Theoretic Security | No | Yes | Yes |
| Distributed | Yes | No | Yes |
| Internal Node Complexity | High | Low | Low |
| Decoding Complexity | High | Exponential | Low |
| General Graphs | No | Yes | Yes |
| Universal | No | No | Yes |

TABLE I: Comparison between the results in this paper and some
prior papers.

IV.  Model  &  Definitions 

We  use  a  general  model  that  encompasses  both  wired  and 
wireless  networks.  To  simplify  notation,  we  consider  only  the 
problem  of  communicating  from  a  single  source  to  a  single 
destination.  But  similar  to  most  network  coding  algorithms,  our 
techniques  generalize  to  multicast  traffic. 

A.  Threat  Model 

There  is  a  source,  Alice,  and  a  destination,  Bob,  who 
communicate  over  a  wired  or  wireless  network.  There  is  also  an 
attacker  Calvin,  hidden  somewhere  in  the  network.  Calvin  aims 
to  prevent  the  transfer  of  information  from  Alice  to  Bob,  or  at 
least  to  minimize  it.  He  can  observe  some  of  the  transmissions, 
and  can  inject  his  own.  When  he  injects  his  own  packets,  he 
pretends  they  are  part  of  the  information  flow  from  Alice  to 
Bob. 

Calvin  is  quite  strong.  He  is  computationally  unbounded.  He 
knows the encoding and decoding schemes of Alice and Bob,
and the network code implemented by the interior nodes. He
also  knows  the  exact  network  realization. 

B.  Network  and  Code  Model 

This  section  describes  the  network  model,  the  packet  format, 
and  how  the  network  transforms  the  packets. 

Network  Model:  The  network  is  modeled  as  a  hypergraph  [22]. 
Each  packet  transmission  corresponds  to  a  hyperedge  directed 
from  the  transmitting  node  to  the  set  of  observer  nodes.  The 
hypergraph  model  captures  both  wired  and  wireless  networks. 
For  wired  networks,  the  hyperedge  is  a  simple  point-to-point 
link.  For  wireless,  each  such  hyperedge  is  determined  by 
instantaneous  channel  realizations  (packets  may  be  lost  due  to 
fading  or  collisions)  and  connects  the  transmitter  to  all  nodes 
that  hear  the  transmission.  The  hypergraph  is  unknown  to  Alice 
and  Bob  prior  to  transmission. 

Source:  Alice  generates  incompressible  data  that  she  wishes 
to deliver to Bob over the network. To do so, Alice encodes
her  data  as  dictated  by  the  encoding  algorithm  (described  in 
subsequent  sections).  She  divides  the  encoded  data  into  batches 
of  b  packets.  For  clarity,  we  focus  on  the  encoding  and  decoding 
of one batch.

A packet contains a sequence of n symbols from the finite
field $\mathbb{F}_q$. All arithmetic operations henceforth are done over
symbols from $\mathbb{F}_q$. (See the treatment in [18].) Out of the n
symbols in Alice's packet, $\delta n$ symbols are redundancy added
by the source.

Alice organizes the data in each batch into a matrix X as
shown in Fig. 2. We denote the (i,j)th element in the matrix by
x(i,j). The ith row in the matrix X is just the ith packet in the
batch. Fig. 2 shows that, similarly to standard network codes [8],
some of the redundancy in the batch is devoted to sending the
identity matrix, I. Also, as in [8], Alice takes random linear
combinations of the rows of X to generate her transmitted
packets. As the packets traverse the network, the internal nodes
apply a linear transform to the batch. The identity matrix
receives the same linear transform. The destination discovers
the linear relation between the packets it receives and those
transmitted by inspecting how I was transformed.
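
To illustrate how the identity matrix exposes the network transform when no adversary is present, here is a minimal sketch. It assumes the prime field GF(257) in place of $\mathbb{F}_q$, a square invertible transform T drawn at random, and the identity placed in the last b columns as in Fig. 2. The names `make_batch` and `decode` are hypothetical, not from the paper.

```python
import random

P = 257  # assumption: prime field GF(257) stands in for F_q

def matmul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]

def solve(A, B):
    """Return M with A M = B over GF(P); raises ValueError if A is singular."""
    n = len(A)
    M = [list(a) + list(b) for a, b in zip(A, B)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % P), None)
        if piv is None:
            raise ValueError("singular matrix")
        M[c], M[piv] = M[piv], M[c]
        s = pow(M[c][c], -1, P)
        M[c] = [v * s % P for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [(x - f * y) % P for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def make_batch(data):
    """Append the b x b identity matrix to b payload rows, giving X = [data | I]."""
    b = len(data)
    return [row + [int(i == j) for j in range(b)] for i, row in enumerate(data)]

def decode(Y, b):
    """Read the transform off the identity columns of Y and invert it."""
    T_hat = [row[-b:] for row in Y]
    return solve(T_hat, Y)

rng = random.Random(0)
b, payload = 3, 4
X = make_batch([[rng.randrange(P) for _ in range(payload)] for _ in range(b)])
while True:  # redraw until the (hypothetical) network transform is invertible
    T = [[rng.randrange(P) for _ in range(b)] for _ in range(b)]
    try:
        recovered = decode(matmul(T, X), b)
        break
    except ValueError:
        continue
```

The receiver never sees T directly; it only reads the last b columns of the received batch, which equal T because the identity was sent there.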

Adversary: Let the matrix Z be the information Calvin injects
into each batch. The size of this matrix is $z_O \times n$, where $z_O$ is
the size of the min-cut from Calvin to the destination.


[Fig. 2: Alice, Bob and Calvin's information matrices. Labels: n, packet size; b, batch size; $\delta n$, redundant symbols; C, network capacity; number of packets Calvin injects.]

| Variable | Definition |
|---|---|
| b | Number of packets in a batch |
| $z_O$ | Number of packets Calvin can inject |
| $z_I$ | Number of packets Calvin can hear |
| n | Length of each packet |
| $\delta$ | Fractional redundancy introduced by Alice |
| $\hat{T}$ | Proxy of the transfer matrix T representing the network transform |

TABLE II: Terms used in the paper.


Destination: Analogously to how Alice generates X, the destination
Bob organizes the received packets into a matrix Y.
The ith received packet corresponds to the ith row of Y. Note
that the number of received packets, and therefore the number
of rows of Y, is a variable dependent on the network topology.
The column rank of Y, however, is $b + z_O$. Bob attempts to
reconstruct Alice's information, X, using the matrix of received
packets Y.

C.  Definitions 

We  define  the  following  concepts. 

• The network capacity, denoted by C, is the time-average
of the maximum number of packets that can be delivered
from Alice to Bob, assuming no adversarial interference,
i.e., the max flow. It can also be expressed as the min-cut
from source to destination. (For the corresponding multicast
case, C is defined as the minimum of the min-cuts over all
destinations.)

• The error probability is the probability that Bob's reconstruction
of Alice's information is inaccurate.

• The rate, R, is the number of information bits in a batch
amortized by the length of a packet in bits.

• The rate R is said to be achievable if for any $\epsilon > 0$, any
$\delta > 0$, and sufficiently large n, there exists a block-length-n
network code with a redundancy $\delta$ and a probability of error
less than $\epsilon$.

• A code is said to be universal if the code design is independent
of $z_O$.

V.  Network  Transform 

This  section  explains  how  Alice's  packets  get  transformed 
as  they  travel  through  the  network.  It  examines  the  effect  the 
adversary  has  on  the  received  packets,  and  Bob’s  decoding 
problem. 

The  network  performs  a  classical  distributed  network 
code  [8].  Specifically,  each  packet  transmitted  by  an  internal 
node  is  a  random  linear  combination  of  its  incoming  packets. 


Thus, the effect of the network at the destination can be
summarized as follows.

$$Y = TX + T_{Z \to Y}Z, \tag{6}$$

where X is the batch of packets sent by Alice, Z refers to the
packets Calvin injects into Alice's batch, and Y is the received
batch. The variable T refers to the linear transform from Alice
to Bob, while $T_{Z \to Y}$ refers to the linear transform from Calvin
to Bob.

As explained in §IV, a classical random network code's X
includes the identity matrix as part of each batch. The identity
matrix sent by Alice incurs the same transform as the rest of
the batch. Thus,

$$\hat{T} = TI + T_{Z \to Y}L, \tag{7}$$

where $\hat{T}$ and L are the columns corresponding to I's location
in Y and Z respectively, as shown in Fig. 2.

In standard network coding, there is no adversary, i.e., Z =
0 and L = 0, and thus $\hat{T} = T$. The destination receives a
description of the network transform in $\hat{T}$ and can decode X as
$\hat{T}^{-1}Y$. In the presence of the adversary, however, the destination
needs to solve (6) and (7) to extract the value of X.

By substituting T from (7), (6) can be simplified to get

$$Y = \hat{T}X + T_{Z \to Y}(Z - LX) \tag{8}$$

$$\phantom{Y} = \hat{T}X + E, \tag{9}$$

where E is a C × n matrix that characterizes Calvin's interference.
Note that the matrix $\hat{T}$, which Bob knows, acts as a proxy
transfer matrix for T, which he doesn't know.

Note that in (6), all terms other than Y are unknown. Further,
it is non-linear due to the cross-product terms, TX and $T_{Z \to Y}Z$.
In contrast, (9) is linear in the unknowns X and E. The rest
of this work focuses on solving (9) under different assumptions
on Calvin's strength.
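
The algebra from (6) to (9) can be checked numerically. The sketch below assumes GF(257) in place of $\mathbb{F}_q$ and uses small, hypothetical dimensions; all matrices are drawn at random, and the variable names are illustrative only. It builds Y from (6), reads $\hat{T}$ and L off the identity columns, and confirms that (7) and (9) hold.

```python
import random

P = 257  # assumption: prime field GF(257) stands in for F_q

def matmul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]

def add(A, B):
    return [[(a + b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[(a - b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

rng = random.Random(1)

def rand(rows, cols):
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: batch size, Calvin's packets, packet length, received packets.
b, z_o, n, m = 3, 2, 8, 5
X = [row + [int(i == j) for j in range(b)] for i, row in enumerate(rand(b, n - b))]
Z = rand(z_o, n)                                  # Calvin's injected packets
T, T_zy = rand(m, b), rand(m, z_o)                # transforms unknown to Bob

Y = add(matmul(T, X), matmul(T_zy, Z))            # equation (6)
L = [row[-b:] for row in Z]                       # columns of Z at I's location
T_hat = [row[-b:] for row in Y]                   # columns of Y at I's location
E = matmul(T_zy, sub(Z, matmul(L, X)))            # interference term from (8)

eq7_holds = T_hat == add(T, matmul(T_zy, L))      # T_hat = T I + T_zy L
eq9_holds = Y == add(matmul(T_hat, X), E)         # Y = T_hat X + E
```

Both flags are identities in the field, so they hold for any choice of the random matrices.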

VI.  Summary  of  Results 

We  have  three  main  results.  Each  result  corresponds  to  a 
distributed,  rate-optimal,  polynomial-time  algorithm  that  defeats 
an adversary of a particular type. The optimality of these rates
has been proven by prior work [3], [29], [12]. Our work,
however, provides a construction of distributed codes/algorithms
that achieve optimal rates. In what follows, let $|\mathcal{T}|$ denote the
number of receivers, and $|\mathcal{E}|$ denote the number of transmissions
in the network.

(1)  Shared  Secret  Model:  This  model  assumes  that  Alice  and 
Bob  have  a  very  low  rate  secret  channel,  the  transmissions  on 
which  are  unknown  to  Calvin.  It  considers  the  transmission  of 
information  via  network  coding  in  a  network  where  Calvin  can 
observe  all  transmissions,  and  can  inject  some  corrupt  packets. 

Theorem 1: The Shared Secret algorithm achieves a rate of
$C - z_O$ with code-complexity $O(nC^2)$. This is the maximum
achievable rate.

In §VII, we prove the above theorem by constructing an algorithm
that achieves the bounds. Note that [7] proves a similar
result for a more constrained model where Alice shares a very
low rate secret channel with all nodes in the network, and the
operations performed by internal nodes are computationally expensive.
Further, their result guarantees cryptographic security,
while we provide information-theoretic security.


(2) Omniscient Adversary Model: This model assumes an
omniscient adversary, i.e., one from whom nothing is hidden. In
particular, Alice and Bob have no shared secrets hidden from
Calvin. It also assumes that the min-cut from the adversary
to the destination, $z_O$, is less than C/2. Prior work proves
that without this condition, it is impossible for the source
and the destination to reliably communicate without a secret
channel [12]. In §VIII, we prove the following.

Theorem 2: The Omniscient Adversary algorithm achieves
a rate of $C - 2z_O$ with code-complexity $O((nC)^3)$. This is the
maximum achievable rate.

(3) Limited Adversary Model: In this model, Calvin is limited
in his eavesdropping power; he can observe at most $z_I$ transmitted
packets. Exploiting this weakness of the adversary results
in an algorithm that, like the Omniscient Adversary algorithm,
operates without a shared secret, but still achieves the higher rate
possible via the Shared Secret algorithm. In particular, in §IX
we prove the following.

Theorem 3: If $z_I < C - 2z_O$, the Limited Adversary
algorithm achieves a rate of $C - z_O$ with code-complexity
$O(nC^2)$. This is the maximum achievable rate.

VII. Shared Secret Model

In the Shared Secret model, Alice and Bob have use of a
strong resource, namely a secret channel over which Alice can
transmit a small amount of information to Bob that is secret
from Calvin. Note that since the internal nodes mix corrupted
and uncorrupted packets, Alice cannot just sign her packets
and have Bob check the signature and throw away corrupted
packets; in extreme cases this might lead to Bob not receiving
any uncorrupted packets. Alice uses the secret channel to send a
hash of her information X to Bob, which Bob can use to distill
the corrupted packets he receives, as explained below.

Shared Secret: Alice generates her secret message in two
steps. She first chooses C parity symbols uniformly at random
from the field $\mathbb{F}_q$. The parity symbols are labeled $r_d$, for $d \in
\{1, \ldots, C\}$. Corresponding to the parity symbols, Alice's parity-check
matrix P is defined as the n × C matrix whose (i,j)th
entry equals $(r_j)^i$, i.e., $r_j$ to the ith power. The second part of
Alice's secret message is the b × C hash matrix H, computed as
the matrix product XP. We assume Alice communicates both
the set of parity symbols and the hash matrix H to Bob over
the secret channel. The combination of these two creates the
shared secret, denoted S, between Alice and Bob. The size of
S is C(b + 1) symbols, which is small in comparison to Alice's
information X. (The size of X is b × n; it can be made arbitrarily
large compared to the size of S by increasing the packet size
n.)

Alice's Encoder: Alice implements the classical random network
encoder described in §IV-B.

Bob's Decoder: Not only is P used by Alice to generate H,
but it is also used by Bob in his decoding process. To be more
precise, Bob computes $YP - \hat{T}H$ using the messages he gets
from the network and the secret channel. We call the outcome
the syndrome matrix S.

By substituting the value of H and using (9), we obtain

$$S = YP - \hat{T}H = (Y - \hat{T}X)P = EP. \tag{10}$$


Thus, if no adversary was present, the packets would not be
corrupted (i.e., E = 0) and S would be an all-zero matrix. As
shown in §IV, X then equals $\hat{T}^{-1}Y$. If Calvin injects corrupt
packets, S will be a non-zero matrix.

Claim 1: The rank of E is at most $z_O$.

Claim 2: The columns of S span the same vector-space as
the columns of E with probability at least $1 - Cncq^{-1}$.

Claim 1 follows from the definition $E = T_{Z \to Y}(Z - LX)$.
Claim 2 is proved in the Appendix. Together, they imply that
Calvin's interference, E, can be written as linear combinations
of the columns of a $C \times z_O$ submatrix S' of S, i.e., E = S'A,
where A is a $z_O \times n$ matrix. This enables Bob to rewrite (9) as
the matrix product

$$Y = [\hat{T} \;\; S'] \begin{bmatrix} X \\ A \end{bmatrix}. \tag{11}$$

Bob does not care about A, but to obtain X, he must solve
(11). Let $|\mathcal{T}|$ and $|\mathcal{E}|$ be the number of terminals and links in
the underlying network.

Claim 3: The matrix $[T \;\; T_{Z \to Y}]$, and thus the matrix $[\hat{T} \;\; S']$,
has full column rank with probability at least $1 - |\mathcal{T}||\mathcal{E}|q^{-1}$.

Claim 3, proved in the Appendix, means that Bob can decode
by simply inverting the C × C matrix $[\hat{T} \;\; S']$ and multiplying
the result by Y. Thus, the Shared Secret algorithm achieves the
rate of $C - z_O - b^2/n$. Here, the asymptotically negligible term
$b^2/n$ corresponds to the overhead due to the identity matrix
Alice appends to X. This rate is shown to be optimal by prior
work [12]. The probability of error is at most the sum of the
probabilities of error in Claims 2 and 3, i.e., $(ncC + |\mathcal{T}||\mathcal{E}|)q^{-1}$.
Of code design, encoding and decoding, both encoding and
decoding require $O(nC^2)$ steps. The costliest step for Alice
is the computation of the hash matrix H, and for Bob is the
computation of the syndrome matrix S.
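
The Shared Secret decoder can be sketched end to end as follows. This is an illustration under stated assumptions rather than the paper's implementation: GF(257) stands in for $\mathbb{F}_q$, Bob is assumed to hold exactly $C = b + z_O$ received packets, the network transforms are drawn at random, and the rare failure events of Claims 2 and 3 are handled by redrawing. The names `parity_matrix`, `shared_secret_decode` and `demo` are hypothetical.

```python
import random

Q = 257  # assumption: prime field GF(257) stands in for F_q

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % Q for col in zip(*B)] for row in A]

def reduce_rows(M, ncols):
    """Gauss-Jordan over GF(Q) on the first ncols columns; returns (reduced, rank)."""
    M, rank = [list(r) for r in M], 0
    for c in range(ncols):
        piv = next((r for r in range(rank, len(M)) if M[r][c] % Q), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        s = pow(M[rank][c], -1, Q)
        M[rank] = [v * s % Q for v in M[rank]]
        for r in range(len(M)):
            if r != rank:
                f = M[r][c]
                M[r] = [(x - f * y) % Q for x, y in zip(M[r], M[rank])]
        rank += 1
    return M, rank

def solve(A, B):
    """Return M with A M = B; raises ValueError unless A is square and invertible."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("matrix is not square")
    M, rank = reduce_rows([a + b for a, b in zip(A, B)], n)
    if rank < n:
        raise ValueError("singular matrix")
    return [row[n:] for row in M]

def parity_matrix(r, n):
    """n x C matrix whose (i, j) entry is r_j to the i-th power, i = 1..n."""
    return [[pow(rj, i, Q) for rj in r] for i in range(1, n + 1)]

def shared_secret_decode(Y, T_hat, r, H):
    """Compute the syndrome S = YP - T_hat H, pick S', and solve (11) for X."""
    S = [[(a - c) % Q for a, c in zip(ra, rc)]
         for ra, rc in zip(matmul(Y, parity_matrix(r, len(Y[0]))), matmul(T_hat, H))]
    basis = []                                   # independent columns of S, i.e. S'
    for col in zip(*S):
        if reduce_rows(basis + [list(col)], len(col))[1] > len(basis):
            basis.append(list(col))
    M = [list(t) + [c[i] for c in basis] for i, t in enumerate(T_hat)]  # [T_hat S']
    return solve(M, Y)[:len(T_hat[0])]           # top b rows of the stacked [X; A]

def demo(seed=0):
    rng = random.Random(seed)
    b, z_o, n = 3, 2, 12
    C = b + z_o

    def rnd(rows, cols):
        return [[rng.randrange(Q) for _ in range(cols)] for _ in range(rows)]

    while True:                                  # redraw on the rare failure events
        X = [row + [int(i == j) for j in range(b)] for i, row in enumerate(rnd(b, n - b))]
        r = [rng.randrange(Q) for _ in range(C)]
        H = matmul(X, parity_matrix(r, n))       # sent with r over the secret channel
        Y = [[(u + v) % Q for u, v in zip(ru, rv)]
             for ru, rv in zip(matmul(rnd(C, b), X), matmul(rnd(C, z_o), rnd(z_o, n)))]
        T_hat = [row[-b:] for row in Y]
        try:
            return shared_secret_decode(Y, T_hat, r, H) == X
        except ValueError:
            continue
```

Calling `demo()` returns whether Bob's reconstruction equals Alice's batch. Whenever the matrix $[\hat{T} \;\; S']$ is square and invertible, the solution of (11) is unique, so the decoded top block must equal X.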

The scheme presented above is universal, i.e., the parameters
of the code do not depend on any knowledge about $z_O$, which in
some sense functions as the "noise parameter" of the network.
Alice therefore has flexibility in tailoring her batch size to the
size of the data which she wishes to transmit and the packet
size allowed by the network. □

VIII. Omniscient Adversary Model

What if we face an omniscient adversary, i.e., Calvin can
observe everything, and there are no shared secrets between
Alice and Bob? We design a network error-correcting code to
defeat such a powerful adversary. Our algorithm achieves a rate
of $R = C - 2z_O$, which is lower than in the Shared Secret model.
This is a direct consequence of Calvin's increased strength.
Recent bounds [3] on network error-correcting codes show that
in fact $C - 2z_O$ is the maximum achievable rate for networks
with an omniscient adversary.

Alice’s  Encoder:  Alice  encodes  in  two  steps.  To  counter 
the  adversary's  interference,  she  first  generates  X  by  adding 
redundancy  to  her  information.  She  then  encodes  X  using  the 
encoder  defined  in  §IV-B. 

Alice adds redundancy as follows. Her original information
is a length-$(bn - \delta n - b^2)$ column vector $\vec{U}$. (Here the fractional
redundancy $\delta$ is dependent on $z_O$, the number of packets Calvin
may inject into the network.) Alice converts $\vec{U}$ into $\vec{X}$, a
length-bn vector $(\vec{U}^T \;\; \vec{R}^T \;\; \vec{I}^T)^T$, where $\vec{I}$ is just the column version of
the b × b identity matrix. It is generated by stacking columns
of the identity matrix one after the other. The second term, $\vec{R}$,
represents the redundancy Alice adds. The redundancy vector
$\vec{R}$ is a length-$\delta n$ column vector generated by solving the matrix
equation

$$D\,(\vec{U}^T \;\; \vec{R}^T \;\; \vec{I}^T)^T = 0$$

for $\vec{R}$, where D is a $\delta n \times bn$ matrix defined as the redundancy matrix.
D is obtained by choosing each element as an independent
and uniformly random symbol from the finite field $\mathbb{F}_q$. Due
to the dependence of D on $\delta$ and thus on $z_O$, the Omniscient
Adversary algorithm is not universal. The redundancy matrix D
is known to all parties (Alice, Bob, and Calvin) and hence
does not constitute a shared secret.
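
One way to picture the redundancy step is to split D column-wise into the blocks that multiply the information, the redundancy and the stacked identity, and then solve the square block for the redundancy vector. The sketch below assumes GF(257) in place of $\mathbb{F}_q$, tiny hypothetical sizes, and that the square block of D is invertible (it redraws D otherwise). The function name `add_redundancy` is illustrative.

```python
import random

Q = 257  # assumption: prime field GF(257) stands in for F_q

def solve(A, rhs):
    """Solve A w = rhs (a vector) over GF(Q); raises ValueError if A is singular."""
    n = len(A)
    M = [list(a) + [v] for a, v in zip(A, rhs)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % Q), None)
        if piv is None:
            raise ValueError("singular matrix")
        M[c], M[piv] = M[piv], M[c]
        s = pow(M[c][c], -1, Q)
        M[c] = [v * s % Q for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [(x - f * y) % Q for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]

def add_redundancy(U, D, b):
    """Return the stacked vector (U, R, I) satisfying D (U, R, I)^T = 0 over GF(Q)."""
    k, red = len(U), len(D)                      # red = delta * n redundant symbols
    I_vec = [int(i == j) for j in range(b) for i in range(b)]  # columns of I stacked
    D_R = [row[k:k + red] for row in D]          # block of D that multiplies R
    rhs = [(-sum(d * x for d, x in zip(row[:k] + row[k + red:], U + I_vec))) % Q
           for row in D]
    return U + solve(D_R, rhs) + I_vec

rng = random.Random(2)
b, n, red = 2, 5, 3                              # hypothetical sizes; red = delta * n
U = [rng.randrange(Q) for _ in range(b * n - red - b * b)]
while True:                                      # redraw D until its R-block is invertible
    D = [[rng.randrange(Q) for _ in range(b * n)] for _ in range(red)]
    try:
        X_vec = add_redundancy(U, D, b)
        break
    except ValueError:
        continue
residual = [sum(d * x for d, x in zip(row, X_vec)) % Q for row in D]
```

Here `residual` is the product of D with the encoded vector, which should vanish, and the first entries of `X_vec` are Alice's information unchanged.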

Alice then proceeds to the standard network encoding. She
rearranges $\vec{X}$, a length-bn vector, into the b × n matrix X. The
jth column of X consists of the $((j - 1)b + 1)$th
through $(jb)$th symbols of $\vec{X}$. From this point on, Alice's
encoder implements the classical random network encoder described
in §IV-B to generate her transmitted packets.

Bob's Decoder: As shown in (9), Bob's received data is related
to Alice's and Calvin's transmitted data as $Y = \hat{T}X + E$. Bob's
objective, as in §VII, is to distill out the effect of the error matrix
E and recover the vector $\vec{X}$. He can then retrieve Alice's data
by extracting the first $(bn - b^2 - \delta n)$ symbols to obtain $\vec{U}$.

To decode, Bob performs the following steps, each of which
corresponds to an elementary matrix operation.

• Determining Calvin's strength: Bob first determines the
strength of the adversary $z_O$, which is the column rank of
$T_{Z \to Y}$. Bob does not know $T_{Z \to Y}$, but since T and $T_{Z \to Y}$
span disjoint vector spaces (Claim 3), the column rank of Y
is equal to the sum of the column ranks of T and $T_{Z \to Y}$.
Since the column rank of T is simply the batch size b, Bob
determines $z_O$ by subtracting b from the column rank of the
matrix Y.

• Discarding irrelevant information: Since the classical random
network code is run without any central coordinating
authority, the packets of information that Bob receives
may be highly redundant. Of the packets Bob receives, he
selectively discards some so that the resulting matrix Y has
$b + z_O$ rows, and has full row rank. For him to consider
more packets is useless, since at most $b + z_O$ packets of
information have been injected into the network, b from
Alice and $z_O$ from Calvin. This operation has the additional
benefit of reducing the complexity of linear operations
that Bob needs to perform henceforth. It also reduces the
dimensions of the matrix $\hat{T}$, since Bob can discard the rows
corresponding to the discarded packets.

• Estimating a "basis" for E: If Bob could directly estimate
a basis for the column space of E, then he could simply
decode as in the Shared Secret algorithm. However, there is
no shared secret that enables him to discover a basis for the
column space of E. So, he instead chooses a proxy error
matrix T'' whose columns (which are, in general, linear
combinations of columns of both X and E) act as a proxy
error basis for the columns of E. This is analogous to step (9),
where the matrix $\hat{T}$ acts as a proxy transfer matrix for the
unknown matrix T.

The matrix T'' is obtained as follows. Bob selects $z_O$
columns from Y such that these columns, together with the
b columns of $\hat{T}$, form a basis for the columns of Y. Without
loss of generality, these columns correspond to the first $z_O$
columns of Y (if not, Bob simply permutes the columns of
Y to make it so). The $(b + z_O) \times z_O$ matrix corresponding
to these first $z_O$ columns is denoted T''.

• Changing to proxy basis: Bob rewrites Y in the basis
corresponding to the columns of the $(b + z_O) \times (b + z_O)$
matrix $[T'' \;\; \hat{T}]$. Therefore Y can now be written as

$$Y = [T'' \;\; \hat{T}] \begin{bmatrix} I_{z_O} & F_Z & 0 \\ 0 & F_X & I_b \end{bmatrix}. \tag{12}$$

Here $\begin{bmatrix} F_Z \\ F_X \end{bmatrix}$ is defined as the $(b + z_O) \times (n - (b + z_O))$
matrix representation of the columns of Y (other than those
in $[T'' \;\; \hat{T}]$) in the new basis, with $F_Z$ and $F_X$ defined as
the sub-matrices of appropriate dimensions.
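
The first and third steps above (estimating Calvin's strength from the rank of Y, and choosing proxy basis columns) can be sketched as follows. This is a minimal illustration assuming GF(257) in place of $\mathbb{F}_q$, a received matrix that already has $b + z_O$ rows, and the identity in the last b columns. Instead of permuting columns as the text does, the hypothetical helper `proxy_basis` returns the indices of the chosen columns.

```python
import random

Q = 257  # assumption: prime field GF(257) stands in for F_q

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % Q for col in zip(*B)] for row in A]

def rank(rows):
    """Rank over GF(Q) of the matrix whose rows are given."""
    M, rk = [list(r) for r in rows], 0
    for c in range(len(M[0]) if M else 0):
        piv = next((r for r in range(rk, len(M)) if M[r][c] % Q), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        s = pow(M[rk][c], -1, Q)
        M[rk] = [v * s % Q for v in M[rk]]
        for r in range(len(M)):
            if r != rk:
                f = M[r][c]
                M[r] = [(x - f * y) % Q for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def proxy_basis(Y, b):
    """Return (z_o, indices of the columns of Y chosen as the proxy error basis T'')."""
    cols = [list(c) for c in zip(*Y)]
    z_o = rank(Y) - b                            # rank(Y) = b + z_o
    basis, chosen = cols[-b:], []                # start from the b columns of T_hat
    for j, col in enumerate(cols[:-b]):
        if len(chosen) < z_o and rank(basis + [col]) > len(basis):
            basis.append(col)                    # col is independent of the basis so far
            chosen.append(j)
    return z_o, chosen

rng = random.Random(3)
b, z_true, n = 3, 2, 10                          # hypothetical sizes

def rnd(rows, cols):
    return [[rng.randrange(Q) for _ in range(cols)] for _ in range(rows)]

while True:                                      # redraw until the rank conditions hold
    X = [row + [int(i == j) for j in range(b)] for i, row in enumerate(rnd(b, n - b))]
    TX = matmul(rnd(b + z_true, b), X)
    TZ = matmul(rnd(b + z_true, z_true), rnd(z_true, n))
    Y = [[(u + v) % Q for u, v in zip(ru, rv)] for ru, rv in zip(TX, TZ)]
    if rank(Y) == b + z_true and rank([row[-b:] for row in Y]) == b:
        break
z_o, chosen = proxy_basis(Y, b)
```

The chosen columns, together with the last b columns of Y, then form the square invertible matrix used for the change of basis in (12).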

Bob splits X as $X = [X_1 \;\; X_2 \;\; X_3]$, where $X_1$ corresponds to
the first $z_O$ columns of X, $X_3$ to the last b columns of X,
and $X_2$ to the remaining columns of X. We perform linear
algebraic manipulations on (12) to reduce it to a form in which
the variables in X are related by a linear transform solely to
quantities that are computable by Bob. Claim 4 summarizes
the effect of these linear algebraic manipulations (proof in
Appendix).

Claim 4: The matrix equation (12) is exactly equivalent to
the matrix equation $\hat{T}X_2 = \hat{T}(F_X + X_1F_Z)$.
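
One direction of Claim 4 can be sketched as follows. This is a reconstruction from (9) and (12), not necessarily the Appendix proof, and the column split $E = [E_1 \;\; E_2 \;\; E_3]$ matching that of X is notation introduced here for illustration.

```latex
\begin{align*}
Y &= \hat{T}X + E, \qquad X = [X_1\ X_2\ X_3], \quad E = [E_1\ E_2\ E_3], \\
T'' &= \hat{T}X_1 + E_1 && \text{(first $z_O$ columns of $Y$)}, \\
T''F_Z + \hat{T}F_X &= \hat{T}X_2 + E_2 && \text{(middle columns of $Y$, using (12))}, \\
\hat{T}\,(X_2 - F_X - X_1F_Z) &= E_1F_Z - E_2 && \text{(substituting the second line into the third)}.
\end{align*}
```

The left side lies in the column space of $\hat{T}$, while the right side lies in the column space of $T_{Z \to Y}$, since every column of $E = T_{Z \to Y}(Z - LX)$ does. If these two column spaces intersect only in the zero vector, which is how Claim 3 is used in the text, both sides vanish, giving $\hat{T}X_2 = \hat{T}(F_X + X_1F_Z)$.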

To complete the proof of correctness of our algorithm, we need
only the following claim, proved in the Appendix.

Claim 5: For $\delta n > n(z_O + \epsilon)$, with probability greater than
$1 - q^{-n\epsilon}$, the system of linear equations

$$\hat{T}X_2 = \hat{T}(F_X + X_1F_Z) \tag{13}$$

$$D\vec{X} = 0 \tag{14}$$

is solvable for X.

The final claim enables Bob to recover X, which contains
Alice's information at asymptotic rate $R = C - 2z_O$. (There is
an asymptotically negligible rate overhead equalling $b^2/n + \epsilon$.
The $b^2/n$ term corresponds, as before, to the identity matrix
appended to X. The term $\epsilon$ takes any positive value, and the
probability of error also depends on it.) The probability of error
is at most the sum of the probabilities of error in Claims 3 and 5,
i.e., $|\mathcal{T}||\mathcal{E}|q^{-1} + q^{-n\epsilon}$. Of code design, encoding and decoding,
the most computationally expensive is decoding. The costliest
step involves inverting the linear transform corresponding to
(13)-(14), which is of dimension O(nC). □

IX.  Limited  Adversary  Model 

We combine the strengths of the Shared Secret algorithm
and the Omniscient Adversary algorithm to achieve the higher
rate of $R = C - z_O$, without needing a secret channel. The
caveat is that Calvin's strength is more limited; the number of
packets he can transmit, $z_O$, and the number he can eavesdrop
on, $z_I$, satisfy the technical constraint

$$2z_O + z_I < C. \tag{15}$$

We call such an adversary a Limited Adversary.

The main idea underlying our Limited Adversary algorithm
is simple. Alice uses the Omniscient Adversary algorithm to
transmit a "short" message to Bob at rate $C - 2z_O$. By (15),
$z_I < C - 2z_O$, so the rate $z_I$ at which Calvin eavesdrops is strictly
less than Alice's rate of transmission $C - 2z_O$. Hence Calvin
cannot decode Alice's message, but Bob can. This means Alice's
message to Bob is secret from Calvin. Alice then builds upon
this secret, using the Shared Secret algorithm to transmit the
bulk of her message to Bob at the higher rate $C - z_O$.
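
As a quick numerical illustration of the rates in Theorems 1-3 and of constraint (15), the following sketch tabulates the asymptotic rates for given parameters. It ignores the asymptotically negligible overheads, and the function name `achievable_rates` and the example values are hypothetical.

```python
def achievable_rates(C, z_o, z_i):
    """Asymptotic rates from Theorems 1-3; None marks a model whose precondition fails."""
    return {
        "shared_secret": C - z_o,                                # Theorem 1
        "omniscient": C - 2 * z_o if 2 * z_o < C else None,      # needs z_o < C/2
        "limited": C - z_o if 2 * z_o + z_i < C else None,       # needs (15)
    }

# Example: capacity 10, Calvin injects 2 packets and overhears 5.
rates = achievable_rates(C=10, z_o=2, z_i=5)
```

In this example (15) holds, since 2·2 + 5 = 9 < 10, so the Limited Adversary rate matches the Shared Secret rate of 8, while the Omniscient Adversary rate is 6. If Calvin could overhear 6 packets instead, (15) would fail and the sketch would report no guaranteed rate for the Limited Adversary model.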

Though the following algorithm requires Alice to know $z_O$
and $z_I$, we describe in §IX-A how to change the algorithm to
make it independent of these parameters. The price we pay is
a slight decrease in rate.

Alice's Encoder: Alice's encoder follows essentially the schema
described above, except for a technicality: the information she
transmits to Bob via the Omniscient Adversary algorithm is
padded with some random symbols. This is for two reasons.
Firstly, since the Omniscient Adversary algorithm has a
probability of error that decays exponentially with the size of the
input, it is not guaranteed to perform well when transmitting just a small
message. Secondly, the randomness in the padded symbols also
ensures strong information-theoretic secrecy of the small secret
message, i.e., we can then show (in Claim 6) that Calvin's best
estimate of any function of the secret information is no better
than if he made random guesses.

Alice's information X decomposes into two parts [X_1 X_2].
She uses the information she wishes to transmit to Bob, at rate
R = C − z_O − Δ, as input to the encoder of the Shared Secret
algorithm, thereby generating the b × n(1 − Δ) sub-matrix X_1.
Here Δ is a parameter that enables Alice to trade off between
the probability of error and rate-loss.

The second sub-matrix, X_2, which we call the secrecy matrix,
is analogous to the secret S used in the Shared Secret algorithm
described in §VII. The size of X_2 is b × Δn. In fact, X_2 is an
encoding of the secret S Alice generates in the Shared Secret
algorithm. The b(C + 1) symbols corresponding to the parity
symbols {r_j} and the hash matrix H are written in the form
of a length-b(C + 1) column vector. This vector is appended
with symbols chosen uniformly at random from F_q to result in
the length-(C − z_O − δ)Δn vector U'. This vector U' could
function as the input U to the Omniscient Adversary algorithm
operated over a packet-size Δn, with a probability of decoding
error that is exponentially small in Δn; however, we actually
use a hash of U' to generate the input U to the Omniscient
Adversary algorithm. To be more precise, U = VU', where
V is any square MDS code generator matrix (see Footnote 1) of dimension
(C − z_O − δ)Δn, known to all parties Alice, Bob, and Calvin.
As we see later, hashing U' with V strengthens the secrecy of S
(and enables the proof of Claim 6 below). Alice then uses the
encoder for the Omniscient Adversary algorithm to generate X_2
from U.

The two components of X, i.e., X_1 and X_2, respectively
correspond to the information Alice wishes to transmit to Bob,
and an implementation of the low rate secret channel. The
fraction of the packet-size corresponding to X_2 is "small",
i.e., Δ. Finally, Alice implements the classical random encoder
described in §IV-B.

Bob's Decoder: Bob arranges his received packets into the
matrix Y = [Y_1 Y_2]. The sub-matrices Y_1 and Y_2 are respectively
the network transforms of X_1 and X_2.

Bob decodes in two steps. Bob first decodes Y_2 to obtain S.
He begins by using the Omniscient Adversary decoder to obtain
the vector U. He obtains U' from U by multiplying by V^{-1}.
He then extracts from U' the b(C + 1) symbols corresponding
to S. The following claim, proved in the Appendix, ensures that
S is indeed secret from Calvin.
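The hashing step U = VU' and Bob's inversion U' = V^{-1}U can be illustrated with a small sketch. It makes several assumptions that are not in the text: a prime field F_257 stands in for F_q, a square Vandermonde matrix with distinct evaluation points stands in for the matrix V, and the helper names are hypothetical. It only demonstrates that the map is invertible, not the secrecy property.

```python
P = 257  # small prime standing in for the field size q (assumption)


def vandermonde(points, p=P):
    """Square Vandermonde matrix with rows [1, x, x^2, ...] over F_p."""
    k = len(points)
    return [[pow(x, j, p) for j in range(k)] for x in points]


def mat_vec(M, v, p=P):
    """Matrix-vector product over F_p."""
    return [sum(a * b for a, b in zip(row, v)) % p for row in M]


def solve(M, y, p=P):
    """Solve M x = y over F_p by Gauss-Jordan elimination (M invertible)."""
    k = len(M)
    aug = [row[:] + [yi] for row, yi in zip(M, y)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] % p != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [a * inv % p for a in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


V = vandermonde([1, 2, 3, 4, 5])   # public matrix, known to all parties
U_prime = [42, 7, 0, 199, 256]     # secret symbols plus random padding
U = mat_vec(V, U_prime)            # Alice: U = V U'
recovered = solve(V, U)            # Bob:   U' = V^{-1} U
```

Because the evaluation points are distinct modulo 257, the Vandermonde matrix is invertible and `recovered` equals `U_prime`.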

Claim 6: The probability that Calvin guesses S correctly is
at most q^{-b(C+1)}, i.e., S is information-theoretically secret from
Calvin.

Thus Alice has now shared S with Bob. Bob uses S as the
side information used by the decoder of the Shared Secret
algorithm to decode Y_1. This enables him to recover X_1, which
contains Alice's information at rate R = C − z_O. (There is
an asymptotically negligible rate overhead equalling b^2/n + Δ.
The b^2/n term corresponds, as before, to the identity matrix
appended to X. The term Δ takes any positive value, and the
probability of error also depends on it.) The probability of error
equals the sum of the probabilities of error in Theorems 1
and 2. The errors in Theorem 1 are analyzed in Claims 3 and 2.
Theorem 2 is used to generate codes of blocklength Δn. This
probability of error is analyzed in Claim 5. Together, an upper
bound on the probability of error is (|𝒯||ℰ| + Cn^C)q^{-1} + q^{-Δnε}.

Since the Limited Adversary algorithm is essentially a
concatenation of the Shared Secret algorithm with the Omniscient
Adversary algorithm, the computational cost is the sum of the
computational costs of the two (with Δn replacing n as the
block-length for the Omniscient Adversary algorithm). This quantity
therefore equals O(nC^2 + (ΔnC)^3). Choosing Δ appropriately
(say  A  =  (C~4n~i)  makes  the  second  term  vanish.  □ 

A.  Limited  Adversary:  Universal  Codes 

We now discuss how to convert the above algorithm to
be independent of the network parameters z_O and z_I. Alice's
challenge is to design for all possible z_O and z_I pairs that satisfy
the constraint (15). For any specific z_I, Alice needs to worry
only about the largest z_O that satisfies (15), because what works
against an attacker with a particular traffic injection strength
works against all weaker attackers. Note that C, z_O, and z_I are
all integers, and thus there are only C − 1 such attackers. For
each of these attackers, Alice designs a different secrecy matrix
X_2 as described above. She appends these C − 1 matrices to her
information X_1 and sends the result as described in the above
section.

1 Secret sharing protocols [25] demonstrate that using MDS code generator
matrices guarantees that to infer even a single symbol of U' from U requires
the entire vector U.

                 Adversarial Strength         Rate        Complexity
Shared Secret    z_O < C, z_I = network       C − z_O     O(nC^2)
Omniscient       z_O < C/2, z_I = network     C − 2z_O    O((nC)^3)
Limited          z_I + 2z_O < C               C − z_O     O(nC^2)

TABLE III: Comparison of our three algorithms

To decode, Bob needs to estimate which secrecy matrix to
use, i.e., which one of them is secret from the attacker. For
this he needs a good upper bound on z_O. But, just as in the
Omniscient Adversary algorithm, he can obtain this by computing
the column rank of Y and subtracting b from it. He then decodes
using the secrecy matrix corresponding to (z_O, C − 1 − 2z_O).
This secrecy matrix suffices since z_I can at most be C − 1 − 2z_O,
which corresponds to Calvin's highest eavesdropping strength
for this z_O. □
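Bob's selection rule can be sketched as follows. This is a minimal illustration, assuming a prime field F_p in place of F_q and hypothetical helper names (`rank_mod_p`, `choose_secrecy_matrix`); the received matrix in the example is made up so that its rank exceeds b by exactly one.

```python
def rank_mod_p(M, p):
    """Rank of an integer matrix over the prime field F_p (Gaussian elimination)."""
    rows = [[x % p for x in row] for row in M]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def choose_secrecy_matrix(Y, b, C, p):
    """Return the pair (z_O estimate, largest compatible z_I) indexing the secrecy matrix."""
    z_O_est = max(rank_mod_p(Y, p) - b, 0)  # rank minus b bounds z_O
    z_I_max = C - 1 - 2 * z_O_est           # largest z_I allowed by (15)
    if z_I_max < 0:
        raise ValueError("estimated z_O violates 2*z_O + z_I < C")
    return z_O_est, z_I_max


# Toy received matrix of rank 5 over F_257, with b = 4 and C = 5.
Y = [
    [1, 0, 0, 0, 0, 3],
    [0, 1, 0, 0, 0, 4],
    [0, 0, 1, 0, 0, 5],
    [0, 0, 0, 1, 0, 6],
    [0, 0, 0, 0, 1, 7],
]
index = choose_secrecy_matrix(Y, b=4, C=5, p=257)
```

Here the rank is 5, so the estimate is z_O = 1 and Bob uses the secrecy matrix designed for the pair (1, 2).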

X.  Conclusion 

Random network codes are vulnerable to Byzantine
adversaries. This work makes them secure. We provide algorithms (see Footnote 2)
which are information-theoretically secure and rate-optimal for
different adversarial strengths as shown in Table I. When the
adversary is omniscient, we show how to achieve a rate of
C − 2z_O, where z_O is the number of packets the adversary
injects and C is the network capacity. If the adversary cannot
observe everything, our algorithms achieve a higher rate, C − z_O.
Both rates are optimal. Further, our algorithms are practical; they
are distributed, have polynomial-time complexity and require no
changes at the internal nodes.

Acknowledgments 

The authors would like to thank Air Force grant FA9550-06-1-0155,
Caltech's Lee Center For Advanced Networking and Microsoft
Research for support.


References 

[1] R. Ahlswede, N. Cai, S. R. Li, and R. W. Yeung. Network information
flow. IEEE Transactions on Information Theory, 46(5):1204-1216, July
2000.

[2] N. Cai and R. W. Yeung. Secure network coding. In Proceedings
of International Symposium in Information Theory and Its Applications,
Lausanne, Switzerland, June 2002.

[3] N. Cai and R. W. Yeung. Network error correction, part 2: Lower bounds.
Submitted to Communications in Information and Systems, 2006.

[4] D. Charles, K. Jain, and K. Lauter. Signatures for network coding. In
Proceedings of the Fortieth Annual Conference on Information Sciences
and Systems, Princeton, NJ, USA, 2006.

[5] D. Dolev, C. Dwork, O. Waarts, and M. Yung. Perfectly secure message
transmission. Journal of the Association for Computing Machinery,
40(1):17-47, January 1993.

[6] C. Gkantsidis and P. Rodriguez. Network coding for large scale content
distribution. In Proceedings of IEEE Conference on Computer
Communications (INFOCOM), Miami, March 2005.

[7] C. Gkantsidis and P. Rodriguez. Cooperative security for network coding
file distribution. In Proceedings of IEEE Conference on Computer
Communications (INFOCOM), Barcelona, April 2006.

2 A refinement of some of the algorithms in this work can be found in [11].


[8] T. Ho, R. Koetter, M. Médard, D. Karger, and M. Effros. The benefits
of coding over routing in a randomized setting. In IEEE International
Symposium on Information Theory (ISIT), page 442, Yokohama, July 2003.

[9] T. Ho, M. Médard, J. Shi, M. Effros, and D. Karger. On randomized
network coding. In Proceedings of 41st Annual Allerton Conference on
Communication, Control, and Computing, Monticello, IL, 2003.

[10] T. C. Ho, B. Leong, R. Koetter, M. Médard, M. Effros, and D. R. Karger.
Byzantine modification detection in multicast networks using randomized
network coding. In International Symposium on Information Theory,
Chicago, USA, June 2004.

[11] S. Jaggi and M. Langberg. Resilient network coding in the presence of
eavesdropping Byzantine adversaries. Submitted to ISIT, 2007.

[12] S. Jaggi, M. Langberg, T. Ho, and M. Effros. Correction of adversarial
errors in networks. In Proceedings of International Symposium in
Information Theory and its Applications, Adelaide, Australia, 2005.

[13] S. Jaggi, P. Sanders, P. A. Chou, M. Effros, S. Egner, K. Jain, and
L. Tolhuizen. Polynomial time algorithms for multicast network code
construction. IEEE Transactions on Information Theory, 51(6):1973-1982,
June 2005.

[14] A. Jiang. Network coding for joint storage and transmission with minimum
cost. In Proceedings of International Symposium in Information Theory
and its Applications, Seattle, Washington, USA, July 2006.

[15] S. Katti, D. Katabi, W. Hu, H. S. Rahul, and M. Médard. The importance
of being opportunistic: Practical network coding for wireless
environments. In 43rd Annual Allerton Conference on Communication, Control,
and Computing, Allerton, 2005.

[16] S. Katti, H. Rahul, D. Katabi, W. Hu, M. Médard, and J. Crowcroft. XORs
in the air: Practical wireless network coding. In ACM SIGCOMM, Pisa,
Italy, 2006.

[17] R. Koetter and F. Kschischang. Coding for errors and erasures in random
network coding. Under submission.

[18] R. Koetter and M. Médard. Beyond routing: An algebraic approach to
network coding. In Proceedings of the 21st Annual Joint Conference of the
IEEE Computer and Communications Societies (INFOCOM), volume 1,
pages 122-130, 2002.

[19] R. Koetter and M. Médard. An algebraic approach to network coding.
IEEE/ACM Transactions on Networking, 11(5):782-795, October 2003.

[20] M. N. Krohn, M. J. Freedman, and D. Mazières. On-the-fly verification of
rateless erasure codes for efficient content distribution. In Proceedings of
the IEEE Symposium on Security and Privacy, 2004, Oakland, California.

[21] S.-Y. R. Li, R. W. Yeung, and N. Cai. Linear network coding. IEEE
Transactions on Information Theory, 49(2):371-381, 2003.

[22] D. Lun, M. Médard, T. Ho, and R. Koetter. Network coding with a
cost criterion. In Proceedings of International Symposium in Information
Theory and its Applications, October 2004.

[23] D. S. Lun, M. Médard, and R. Koetter. Efficient operation of wireless
packet networks using network coding. In International Workshop on
Convergent Technologies (IWCT), Oulu, Finland, 2005.

[24] A. Pelc and D. Peleg. Broadcasting with locally bounded byzantine faults.
Information Processing Letters, 93(3):109-115, February 2005.

[25] T. Rabin and M. Ben-Or. Verifiable secret sharing and multiparty protocols
with honest majority. In Proceedings of the Twenty-First Annual ACM
Symposium on Theory of Computing, pages 73-85, Seattle, Washington,
United States, 1989.

[26] L. Subramanian. Decentralized Security Mechanisms for Routing
Protocols. PhD thesis, University of California at Berkeley, Computer Science
Division, Berkeley, CA 94720, 2005.

[27] H. J. Wertz. On the numerical inversion of a recurrent problem: The
Vandermonde matrix. AC-10:492, Oct. 1965.

[28] J. E. Wieselthier, G. D. Nguyen, and A. Ephremides. On the construction
of energy-efficient broadcast and multicast trees in wireless networks. In
IEEE Infocom, volume 2, pages 585-594. IEEE, 2000.

[29] R. W. Yeung and N. Cai. Network error correction, part 1: Basic concepts
and upper bounds. Submitted to Communications in Information and
Systems, 2006.


APPENDIX

A. Proof of Claim 2


The idea behind the claim is as follows. The parity-check matrix P
is, by construction, a Vandermonde matrix [27], and therefore has
full column rank. Further, since P is hidden from Calvin, with high
probability he cannot choose interference such that the matrix product
EP has a lower column rank than does E.

To prove this we use a generalization of an argument used in [12].
Let S_{i,j} denote the (i, j) element of S = EP. We note that for each
(i, j), S_{i,j} can be thought of as a polynomial in r_j, with coefficients
from the ith row of E. Since S_{i,j} has degree at most n in r_j, at most
n values of r_j satisfy the equation S_{i,j} = c, for any scalar c ∈ F_q.
Since Calvin does not know the values of the r_j's, the probability he
can choose entries in E to satisfy any such equation is at most nq^{-1}.

In particular, the probability that the first row of S consists of
the length-C zero vector is at most (nq^{-1})^C. For a particular choice
of the first row of S, the probability that the second row is linearly
dependent on the first row (i.e., is any scalar multiple of the first row)
is at most n^C/q^{C−1}. Similarly, the probability that the third row is
any of the q^2 possible linear combinations of the first two rows is at
most n^C/q^{C−2}. Continuing thus, the probability that the ith row of
S is linearly dependent on the previous i − 1 is at most n^C/q^{C−i+1}.
Taking the union bound over all C events, the probability that S is
singular is at most n^C Σ_{i=1}^{C} 1/q^{C−i+1}. Since the largest summand
equals n^C/q, the probability of the undesirable event is at
most Cn^C q^{-1}. Hence, with probability at least 1 − Cn^C q^{-1}, E and
S are related via an invertible transformation. Note that q is a design
parameter and can be chosen to be much larger than Cn^C to make the
probability of error arbitrarily small. □
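The union bound in this proof is easy to check numerically. The sketch below uses exact rational arithmetic to compare the sum n^C Σ 1/q^{C−i+1} with the looser bound Cn^C/q, and to find how large a field of the form q = 2^m must be for the looser bound to fall below a target. The function names and the restriction to powers of two are illustrative assumptions.

```python
from fractions import Fraction


def claim2_bounds(C, n, q):
    """Return (sum of the C per-row bounds, looser bound C * n^C / q)."""
    terms = [Fraction(n**C, q ** (C - i + 1)) for i in range(1, C + 1)]
    return sum(terms), Fraction(C * n**C, q)


def min_field_exponent(C, n, target):
    """Smallest m such that q = 2^m gives C * n^C / q <= target."""
    m = 1
    while Fraction(C * n**C, 2**m) > target:
        m += 1
    return m


exact_sum, loose_bound = claim2_bounds(C=3, n=4, q=2**16)
m = min_field_exponent(C=3, n=4, target=Fraction(1, 1000))
```

For C = 3 and n = 4 we have Cn^C = 192, so a failure bound of 1/1000 needs q of at least 192000, i.e. m = 18.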

B.  Proof  of  Claim  3 

The proof of Claim 3 follows directly from [9]. Essentially, it is a
consequence of the following facts. First, due to [9], with probability
at least (1 − |𝒯|q^{-1})^{|ℰ|} over network code design, [T T_{Z→Y}] has full
column rank. Here |𝒯| is the number of terminals in the multicast
connection, and |ℰ| is the number of (hyper)links in the underlying
network. Secondly, the matrix [T S'] can be obtained via an invertible
transformation from the matrix [T T_{Z→Y}]. Lastly, for large enough q,
the quantity (1 − |𝒯|q^{-1})^{|ℰ|} is strictly greater than 1 − |𝒯||ℰ|q^{-1}. □

C.  Proof  of  Claim  4 

Rewriting the right-hand side of (12) and substituting for Y from (8)
results in

T̂X + T_{Z→Y}(Z − LX) = T̂[0 F_X I_b] + T''[I_{z_O} F_Z 0]. (16)

Since the columns of T'' are spanned by the columns of [T̂ T_{Z→Y}],
we may write T'' as T̂M_1 + T_{Z→Y}M_2, where the matrices
M_1 and M_2 represent the appropriate basis transformation. Thus (16)
becomes

T̂X + T_{Z→Y}(Z − LX) = T̂[0 F_X I_b] + (T̂M_1 + T_{Z→Y}M_2)[I_{z_O} F_Z 0]. (17)

Since the vector spaces spanned by the columns of T̂ and T_{Z→Y} are
disjoint (except in the zero vector), we may compare the term
multiplying the matrix T̂ on both sides of (17) (we may also compare
the term corresponding to T_{Z→Y}, but this gives us nothing useful).
This comparison gives us the equation

T̂X = T̂[0 F_X I_b] + T̂M_1[I_{z_O} F_Z 0]. (18)

We split the matrix equation (16) into three parts, corresponding to the
sub-matrices X_1, X_2 and X_3 of X. Thus (18) now splits into the three
equations

T̂X_1 = T̂M_1, (19)

T̂X_2 = T̂F_X + T̂M_1F_Z, and (20)

T̂X_3 = T̂. (21)

Equation (21) is trivial, since it only reiterates that X_3 equals columns
of an identity matrix. Equation (19) allows us to estimate that M_1
equals X_1. We are finally left with (20), which by substituting for M_1
from (19) reduces to

T̂X_2 = T̂(F_X + X_1F_Z). (22)

□

D.  Proof  of  Claim  5 

For i = 1, 2, we denote by X⃗_i the vector obtained by stacking the
columns of X_i one after the other. Let D = [D_1 D_2], where D_2
corresponds to the last b^2 columns of D and D_1 corresponds to the
remaining columns of D. Define α = n − (b + z_O). Denote by F⃗_X
the vector formed by stacking columns of the matrix F_X one after the
other, and by f_{i,j} the (i, j)th entry of the matrix F_Z. The system of
linear equations (13)-(14) can be written in matrix form as


where  A  is  given  by 
This matrix A is described by smaller dimensional matrices as
entries. The matrix T̂ has dimensions (b + z_O) × b. The jth row
of matrices in the top portion of matrix A describes an equation
corresponding to the jth column of the matrix equation in Equation 13.
The bottom portion of A corresponds to Equation 14. Bob can recover
the variables X(i,j) if and only if the above matrix A has full column
rank. We now analyze A to show that this is indeed the case (with high
probability) for sufficiently large δn. Using Claim 3, we can assume
that T̂ has full column rank, and therefore the last αb columns of the
matrix (represented by the right side of A) have full column rank.

We now address the left columns of A. Consider performing
column operations from right to left, to zero out the T̂'s in the left
side of the top rows of A (that is, to zero out the upper left sub-matrix
of A). A has full column rank iff after this process the lower left
sub-matrix of A has full column rank. We show that this is the case with
high probability over the random elements of D (when δn is chosen to
be sufficiently large). Let f_{i,j}'s be the values appearing in the upper left
sub-matrix of A. We show that for any (adversarial) choice of f_{i,j}'s,
with high probability, the act of zeroing out the T̂'s yields a lower left
sub-matrix of A with full column rank. Then using the union bound
on all possible values of f_{i,j} we obtain our assertion.

For any fixed values of f_{i,j}, let C(j), for j = 1 to bz_O, denote
the columns of the lower left sub-matrix of A after zeroing out the
T̂'s. For each j, the vector C(j) is a linear combination of the (lower
part of the) jth column of A with columns from the lower right
sub-matrix of A. As the entries of D_1 are independent random variables
uniformly distributed in F_q, the columns C(j) for j = 1, ..., bz_O
consist of independent entries that are also uniformly distributed in F_q.
Standard analysis shows that the probability that the columns C(j) are
not independent is q^{bz_O − δn}. For the union bound we would like this
probability to be at most q^{−(n − (b + z_O))z_O − nε}. Thus, it
suffices to take δn = n(z_O + ε) for an error probability of at most
q^{−nε}. Recall that b = C − z_O.

E.  Proof  of  Claim  6 

The vector U was generated from U' via an MDS code generator
matrix (see Footnote 1), and a folklore result about network codes
is that with high probability over random network code design the
linear transform between Alice and Calvin also has the MDS property.
Thus, for Calvin to infer even a single symbol of the length-(C −
z_O − δ)nΔ vector U', he needs to have received at least (C − z_O −
δ)nΔ linear combinations of the variables in the secrecy matrix X_2.
Since Calvin can overhear z_I packets, he has access to z_I nΔ equations
that are linear in the unknown variables. The difference between the
number of variables unknown to Calvin and the number of equations
Calvin has is linear in nΔ; for large enough nΔ, this difference is
larger than b(C + 1), the length of the vector S. By a direct extension
of [25], Calvin's probability of guessing any function of S correctly is
q^{−b(C+1)}. □
Random Linear Network Coding:
A free cipher?

Luísa Lima, Muriel Médard, João Barros


Abstract: We consider the level of information security
provided by random linear network coding in network scenarios in
which all nodes comply with the communication protocols yet are
assumed to be potential eavesdroppers (i.e., "nice but curious").
For this setup, which differs from wiretapping scenarios
considered previously, we develop a natural algebraic security criterion,
and prove several of its key properties. A preliminary analysis
of the impact of network topology on the overall network coding
security, in particular for complete directed acyclic graphs, is
also included.

Index Terms: security, information theory, graph theory,
network coding.


I.  Introduction 

Under the classical networking paradigm, in which
intermediate nodes are only allowed to store and forward packets,
information security is usually viewed as an independent
feature with little or no relation to other communication tasks.
In fact, since intermediate nodes receive exact copies of the
sent packets, data confidentiality is commonly ensured by
cryptographic means at higher layers of the protocol stack.
Breaking with the ruling paradigm, network coding allows
intermediate nodes to mix information from different data
flows [1], [2] and thus provides an intrinsic level of data
security, arguably one of the least well understood benefits
of network coding.

Previous work on this issue has been mostly concerned
with constructing codes capable of splitting the data among
different links, such that reconstruction by a wiretapper is
either very difficult or impossible. In [3], the authors present
a secure linear network code that achieves perfect secrecy
against an attacker with access to a limited number of links.
A similar problem is considered in [4], featuring a random
coding approach in which only the input vector is
modified. [5] introduces a different information-theoretic security
model, in which a system is deemed to be secure if an
eavesdropper is unable to get any decoded or decodable (also
called meaningful) source data. Still focusing on wiretapping
attacks, [6] provides a simple security protocol exploiting the
network topology: an attacker is shown to be unable to get any
meaningful information unless it can access those links that
are necessary for the communication between the legitimate
sender and the receiver, who are assumed to be using network
coding. As a distributed capacity-achieving approach for the
multicast case, randomized network coding [7], [8] has been
shown to extend naturally to packet networks with losses [9]
and Byzantine modifications (both detection and correction
[10], [11], [12], [13]). [14] adds a cost criterion to the secure
network coding problem, providing heuristic solutions for a
coding scheme that minimizes both the network cost and
the probability that the wiretapper is able to retrieve all the
messages of interest.

Fig. 1. Canonical Network Coding Example. In this image, intermediate nodes
are represented with squares. With this code, node 4 is a vulnerability for
the network since it can decode all the information sent through it. Note
that the complete opposite happens for node 5, which receives no meaningful
information whatsoever.

In  this  work,  we  approach  network  coding  security  from 
a  different  angle:  our  focus  is  not  on  the  threat  posed  by 
external  wiretappers  but  on  the  more  general  threat  posed 
by  intermediate  nodes.  We  assume  that  the  network  consists 
entirely  of  “nice  but  curious”  nodes,  i.e.  they  comply  with 
the  communication  protocols  (in  that  sense,  they  are  well- 
behaved)  but  may  try  to  acquire  as  much  information  as 
possible  from  the  data  that  passes  through  them  (in  which  case, 
they  are  potentially  malicious).  This  notion  is  highlighted  in 
the  following  example. 

Example 1: Consider the canonical network coding
example with 7 nodes, shown in Figure 1. Node 1 sends a flow
to sinks 6 and 7 through intermediate nodes 2, 3, 4 and 5.
From the point of view of security, we can distinguish between three
types of intermediate nodes in this setting: (1) those that only
get a non-meaningful part of the information, such as node
5; (2) those that obtain all of the information, such as node
4; and (3) those that get partial yet meaningful information,
such as nodes 2 and 3. Although this network code could be
considered secure against single-edge external wiretapping
(i.e., the wiretapper is not able to retrieve the whole data simply
by eavesdropping on a single edge), it is clearly insecure
against internal eavesdropping by an intermediate node.
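The three node types in Example 1 can be reproduced with a small sketch over GF(2). Figure 1 is not available in the text, so the coefficient vectors below are an assumption based on the standard butterfly code: the source holds two bits (a, b), nodes 2 and 3 each receive one of them, node 4 receives both, and node 5 receives only a ⊕ b. The function names are hypothetical.

```python
def span_gf2(vectors):
    """All GF(2) linear combinations of the given coefficient vectors."""
    span = {tuple([0] * len(vectors[0]))}
    for v in vectors:
        # Adding v to every element already in the span doubles it (or leaves it unchanged).
        span |= {tuple(a ^ b for a, b in zip(s, v)) for s in span}
    return span


def classify(observed, K=2):
    """Classify a node by how many source symbols it can decode on its own."""
    reachable = span_gf2(observed)
    units = [tuple(int(i == j) for j in range(K)) for i in range(K)]
    decodable = sum(u in reachable for u in units)
    if decodable == K:
        return "all"
    return "partial" if decodable > 0 else "none"


# Coefficient vectors over (a, b) seen by each intermediate node (assumed butterfly code).
observations = {
    2: [(1, 0)],          # receives a
    3: [(0, 1)],          # receives b
    4: [(1, 0), (0, 1)],  # receives a and b
    5: [(1, 1)],          # receives a XOR b only
}
classes = {node: classify(vecs) for node, vecs in observations.items()}
```

Under these assumptions node 4 is classified as "all", node 5 as "none", and nodes 2 and 3 as "partial", matching the three types listed in the example.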

Motivated by this example, we set out to investigate the
security potential of network coding. Our main contributions
are as follows:

•  Problem Formulation: We formulate a secure network
coding problem, in which all intermediate nodes are
viewed as potential eavesdroppers and the goal is to
characterize the intrinsic level of security provided by
random linear network coding.

•  Algebraic Security Criterion: Based on the notion that the
number of decodable bits available to each intermediate
node is limited by the degrees of freedom it receives, we
are able to provide a natural secrecy constraint for
network coding and to prove some of its most fundamental
properties.

•  Security  Analysis  for  Complete  Directed  Acyclic  Graphs : 
As  a  preliminary  step  towards  understanding  the  interplay 
between  network  topology  and  security  against  eaves¬ 
dropping  nodes,  we  present  a  rigorous  characterization 
of  the  achievable  level  of  algebraic  security  for  this  class 
of  complete  graphs. 

The remainder of this paper is organized as follows. First, a
formal problem statement is given in Section II, followed by a
detailed analysis of the algebraic security of Randomized Linear
Network Coding in Section III. In Section IV, this analysis is
carried out specifically for complete directed acyclic graphs.
The paper concludes with Section V.

II.  Problem  Setup 

We adopt the network model of [2]: we represent the
network as an acyclic directed graph G = (V, E), where
V is the set of nodes and E is the set of edges. Edges
are denoted by round brackets e = (v, v') ∈ E, in which
v = head(e) and v' = tail(e). The set of edges that end at a
vertex v ∈ V is denoted by Γ_I(v) = {e ∈ E : head(e) = v},
and the in-degree of the vertex is δ_I(v) = |Γ_I(v)|; similarly,
the set of edges originating at a vertex v ∈ V is denoted
by Γ_O(v) = {e ∈ E : tail(e) = v}, the out-degree being
represented by δ_O(v) = |Γ_O(v)|.

Discrete random processes X_1, ..., X_K are observable at one
or more source nodes. To simplify the analysis, we shall
consider that each network link is free of delays and that
there are no losses. Moreover, the capacity of each link is
one bit per unit time, and the random processes X_i have a
constant entropy rate of one bit per unit time. Edges with
larger capacities are modelled as parallel edges and sources
of larger entropy rate are modelled as multiple sources at the
same node. We shall consider multicast connections as it is
the most general type of single connection; there are d > 1
receiver nodes. The objective is to transmit all the source
processes to each of the receiver nodes.

In linear network coding, edge e = (v, u) carries the process
Y(e), which is defined below:


The transfer matrix M describes the relationship between
an input vector x and an output vector z, z = xM; M =
A(I − F)^{-1}B^T, where A and B represent, respectively, the
linear mixings of the input vector and of the output vector, and
have sizes K × |E| and ν × |E|. F is the adjacency matrix
of the directed labelled line graph corresponding to the graph
G. In this paper we shall not consider matrix B, which only
refers to the decoding at the receivers. Thus, we shall mainly
analyse parts of the matrix AG, such that G = (I − F)^{-1};
a_i and c_i denote column i of A and AG, respectively. We
define the partial transfer matrix (also called auxiliary
encoding vector [9]) as the observable matrix at a given node
v, i.e. the observed matrix formed by the symbols received at
a node v. This is equivalent to the fraction of the data that an
intermediate node has access to in a multicast transmission.

Regarding the coding scheme, we consider the random
linear network coding scheme introduced in [7]; thus,
each coefficient of the matrices described above is chosen
independently and uniformly over all elements of a finite field
$\mathbb{F}_q$, $q = 2^m$.
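
As an illustration of this coding scheme, the sketch below forms one coded packet as a random linear combination of incoming packets over $\mathbb{F}_{2^8}$. The exponent $m = 8$, the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ and all function names are assumptions chosen for the example; the text fixes only $q = 2^m$.

```python
import random

M_BITS = 8      # assumed field exponent, q = 2**8
POLY = 0x11B    # assumed irreducible polynomial x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) in polynomial basis."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M_BITS):
            a ^= POLY
    return result


def combine(packets, coeffs):
    """Return sum_i coeffs[i] * packets[i]; addition in GF(2^m) is XOR."""
    out = [0] * len(packets[0])
    for packet, c in zip(packets, coeffs):
        for j, symbol in enumerate(packet):
            out[j] ^= gf_mul(c, symbol)
    return out


def random_coefficients(k, rng=random):
    """Draw k coefficients independently and uniformly from GF(2^8)."""
    return [rng.randrange(1 << M_BITS) for _ in range(k)]


packets = [[0x01, 0x02, 0x03], [0x10, 0x20, 0x30]]
coeffs = random_coefficients(len(packets))
coded = combine(packets, coeffs)
print(coeffs, coded)
```

An intermediate node would apply `combine` to whatever packets it has received, so the coefficients seen downstream are products and sums of the locally drawn ones.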

Our goal is to evaluate the intrinsic security of random
linear network coding in multicast scenarios where all the
intermediate nodes in the network are potentially malicious
eavesdroppers. Specifically, our threat model assumes that
intermediate nodes perform the coding operations as outlined
above and will try to decode as much data as possible.

III.  Algebraic  Security  of  Random  Linear 
Network  Coding 

A.  Algebraic  security 

The Shannon criterion for information-theoretic security
[15] corresponds in general terms to a zero mutual
information between the cypher-text ($C$) and the original
message ($M$), i.e. $I(M; C) = 0$. This condition implies that
an attacker must guess $< H(M)$ symbols to be able to
compromise the data. With network coding, on the other hand,
if the attacker is capable of guessing $M$ symbols, $K - M$
additional observed symbols are required for decoding: by
noting that each received symbol is a linear combination of
the $K$ message symbols from the source, we can see that a
receiver must receive $K$ coded symbols in order to recover one
message symbol. Thus, as will be shown later, restricted rank
sets of individual symbols do not translate into immediately
decodable data with high probability. This notion is illustrated
in Figure 2. In the scheme shown on top, each intermediate
node can recover half of the transmitted symbols, whereas in
the bottom scheme none of the nodes can recover any portion
of the sent data.

Definition 1 (Algebraic Security Criterion): The level of
security provided by random linear network coding is measured
by the number of symbols that an intermediate node $v$
has to guess in order to decode one of the transmitted symbols.
From a formal point of view,

\[ \Delta_S(v) = \frac{K - \left(\mathrm{rank}(M_{\Gamma_I(v)}) + l_d\right)}{K}, \]

where $l_d$ represents the number of partially diagonalizable
lines of the matrix (i.e. the number of message symbols that
can be recovered by Gaussian elimination).

Fig. 2. Example of algebraic security. In the upper scheme data is not
protected, whereas in the lower scheme nodes 2 and 3 are unable to recover
any data symbols.

Notice that the previous definition is equivalent to computing
the difference between the global rank of the code and
the local rank in each intermediate node $v$. Moreover, as more
and more symbols become compromised, the level of security
tends to 0. As we shall show in this section, with high
probability the number of individually decodable symbols $l_d$
goes to zero as the size of the field goes
to infinity.
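
A minimal sketch of how the criterion of Definition 1 could be evaluated is given below. It assumes a prime field $\mathbb{F}_p$ instead of $\mathbb{F}_{2^m}$ to keep the arithmetic short, takes as input the coding vectors observed at a node (one row per received symbol), and counts as $l_d$ the rows of the reduced row echelon form that contain a single nonzero entry. The function names and the example inputs are hypothetical.

```python
from fractions import Fraction


def rref_mod_p(rows, p):
    """Reduced row echelon form over GF(p), p prime; zero rows are dropped."""
    m = [[x % p for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        candidates = [r for r in range(pivot_row, len(m)) if m[r][col]]
        if not candidates:
            continue
        r0 = candidates[0]
        m[pivot_row], m[r0] = m[r0], m[pivot_row]
        # Inverse of the pivot by Fermat's little theorem (p prime).
        inv = pow(m[pivot_row][col], p - 2, p)
        m[pivot_row] = [(x * inv) % p for x in m[pivot_row]]
        for r in range(len(m)):
            if r != pivot_row and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return [row for row in m if any(row)]


def algebraic_security(received, K, p=2):
    """Delta_S = (K - (rank + l_d)) / K for the coding vectors seen at a node."""
    reduced = rref_mod_p(received, p)
    rank = len(reduced)
    # A reduced row with one nonzero entry exposes one message symbol directly.
    l_d = sum(1 for row in reduced if sum(1 for x in row if x) == 1)
    return Fraction(K - (rank + l_d), K)


# Hypothetical two-symbol examples: a node that sees symbol a in the clear,
# and a node that sees only the mixture a + b.
print(algebraic_security([[1, 0]], K=2))
print(algebraic_security([[1, 1]], K=2))
```

The third case in the tests, with two received vectors over $\mathbb{F}_5$, shows why Gaussian elimination matters: neither received vector is a unit vector, yet elimination isolates one message symbol.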


B.  Security  Characterization 

We are now ready to solve the problem of characterizing
the algebraic security of random linear network coding. The
key to our proofs is to analyze the properties of the partial
transfer matrix at each intermediate node. Recall that there are
two cases in which the intermediate node can gain access to
relevant information: (1) when the partial transfer matrix has
full rank and (2) when the partial transfer matrix has
diagonalizable parts. Thus, we shall carry out independent analyses in
terms of rank and in terms of partially diagonalizable matrices.
The following lemmas will be useful.

Lemma 1: In the random linear network coding scheme,

\[ P(\Delta_S > 0) < P(\exists v : \delta_I(v) > K). \]

Proof: See the Appendix. ■

It follows from this lemma that it is only necessary to consider
the case in which $K < \delta_I(v)$.

Lemma 2: The probability that a linear combination of
independent and uniformly distributed values in $\mathbb{F}_q$ yields the
zero result is bounded by

\[ P(X_{lin} = 0) \leq \frac{2q + h(q)}{q^2}, \]

where $h(q)$ is a function such that $O(h(q)) < O(q^2)$. Moreover,
$P(X_{lin} = 0)$ tends to 0 when $q \to \infty$.

Proof: See the Appendix. ■

Lemma  3:  The  probability  of  obtaining  y  zeros  in  one  line 
of  the  £  x  £  transfer  matrix  A/  is  bounded  by 


P(Y  =  y)< 


Proof:  See  the  Appendix.  ■ 

Theorem 1: Let $P(l_d > 0)$ be the probability of recovering
a strictly positive number of symbols $l_d$ at the intermediate
nodes with $\delta_I(v) \leq K - 1$ by Gaussian elimination. Then,
$P(l_d > 0) \to 0$ with $q \to \infty$ and $K \to \infty$.

Proof: Let $M'$ be the transpose of the partial transfer
matrix at some vertex $v$, $M' = M^T_{\Gamma_I(v)}$. We consider the
process of Gaussian elimination of $M'$. It is unnecessary
to consider rank $K$, since in that case the matrix, w.h.p.,
is invertible and hence diagonalizable [8]. Thus, $M'$ is a
$\delta_I(v) \times K$ matrix, $\delta_I(v) < K$.

We prove the theorem constructively by analysing the
probability of having $K - 1$ zeros in one or more lines of
$M'$. Let $p$ be the probability of having $K - 1$ zeros in a
line of $M'$, and let $X$ be a random variable representing the
recoverable number of symbols when an intermediate node
has $\delta_I(v)$ degrees of freedom. It follows from Lemma 3,
taking $y = K - 1$, that $p$ is bounded accordingly.

In the base case with $\delta_I(v) = 1$, at most $X = 1$ symbols can
be recovered, since there are not enough degrees of freedom
to perform Gaussian elimination and the only chance for
recovering a symbol is that the line of the matrix $M'$ already
has $K - 1$ zeros. The probability for this is $p$.

In the case that $1 < \delta_I(v) < K$, we can obtain directly a
number $L = l$ of lines with $K - 1$ zeros, and a number $\delta_I(v) - l$
of lines in the opposite situation. Since we have $\delta_I(v)$ degrees
of freedom to perform Gaussian elimination, we can obtain at
most $\delta_I(v)$ symbols by successive elimination. At each step
the probability of obtaining a line with $K - 1$ zeros is bounded
by $p$.

By analysing the different possibilities of combinations for
the lines that already have $K - 1$ zeros and the ones that can
be obtained by Gaussian elimination, we get

\[ P(X = x) \leq \sum_{l=0}^{\delta_I(v)} \binom{\delta_I(v)}{l} p^l (1 - p)^{\delta_I(v) - l} P_l(X = x), \]

where $P_l(X = x)$ represents $P(X = x \mid L = l)$.

Approximating the binomial distribution by a normal distribution
yields

\[ P_l(X = x) \approx \frac{e'}{\sqrt{2\pi (\delta_I(v) - l)\, p (1 - p)}}, \]

where

\[ e' = \exp\left( -\frac{1}{2} \, \frac{\left(x - (\delta_I(v) - l) p\right)^2}{(\delta_I(v) - l)\, p (1 - p)} \right). \]

Since $p \to p^* < 1$, we can state that, when $q \to \infty$ and $p \to 0$,
$e'$ is $\approx \exp(-x^2)$. When $K$ goes to $\infty$, so does $x$, and hence

\[ \exp(-x^2) \to 0 \quad \text{as } x \to \infty, \]

and

\[ P_l(X = K - 1) \to 0 \quad \text{as } q \to \infty,\ K \to \infty. \]

Since

\[ P(X = K - 1) = \sum_{l=0}^{\delta_I(v)} \binom{\delta_I(v)}{l} p^l (1 - p)^{\delta_I(v) - l} P_l(X = K - 1), \]

and $P_l(X = K - 1)$ decreases exponentially, and $l$ only
increases linearly,

\[ P(X = K - 1) \to 0 \quad \text{as } q \to \infty,\ K \to \infty. \]

The probability of obtaining $X < K - 1$ symbols is bounded
by $P(X = K - 1)$; it follows that the probability of decoding
$X$ symbols with any $\delta_I(v) < K$ goes to zero as $q$ and $K$ tend
to infinity. ■

IV.  Algebraic  Security  of  the  Complete  Graph 

Notice that, in consequence of the property outlined in
Lemma 1, the algebraic security of a graph is topology
dependent. A node with $\delta_I(v) > K$ will not necessarily
receive a full-rank partial transfer matrix. The rank depends
on the available paths between sources and each intermediate
node. More specifically, depending on the topology of the
graph, some nodes may receive only combinations of symbols
derived from matrices with restricted rank, i.e. less than $K$.
This includes, for example, trees, where a node connected
directly to the source by a link of capacity $C$ can only have
children that receive at most rank $C$.

As a first step towards general network models, we consider
the case of complete acyclic directed graphs $G = (V, E)$,
$n = |V|$, which can be generated as follows.

•  Generate random labels for the $n$ vertices. These have
some ordering $\{e_1, e_2, \ldots, e_n\}$ associated to them;

•  Make an outgoing (directed) edge from the vertex with
the minimum label to every vertex with a higher label;

•  Continue until we reach a vertex where there are no more
possibilities for connections.

This algorithm generates a complete acyclic directed graph
with one source, one sink and $|E| = n(n - 1)/2$ edges,
since the total degree of each vertex is $n - 1 = \delta_I(v) +
\delta_O(v)$. The source and the sink are naturally determined as
those nodes that have only outgoing edges or only incoming
edges, respectively. The ordering ensures that this algorithm
always generates an acyclic directed graph, conferring the
graphs generated in this way specific properties such as the
distribution of the in- and out-degrees. These properties can be
determined directly from the order of the vertex using $\delta_O(v) =
n - \mathrm{order}(v)$ and $\delta_I(v) = n - \delta_O(v) - 1 = \mathrm{order}(v) - 1$.
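
The construction can be sketched in a few lines. The function name, the label range and the choice $n = 6$ are assumptions for illustration; the sketch follows the three steps above by drawing distinct random labels, ranking them, and connecting every vertex to all vertices with a higher label.

```python
import random


def complete_dag(n, rng=random):
    """Build a complete acyclic directed graph on n randomly labelled vertices.

    Returns (order, edges): `order` maps each label to its rank (1 = smallest),
    and `edges` is a list of (tail_label, head_label) pairs.
    """
    labels = rng.sample(range(10 * n), n)   # distinct random labels
    ranked = sorted(labels)
    order = {label: i + 1 for i, label in enumerate(ranked)}
    edges = [(u, v) for u in ranked for v in ranked if u < v]
    return order, edges


n = 6
order, edges = complete_dag(n)
in_deg = {v: sum(1 for _, head in edges if head == v) for v in order}
out_deg = {v: sum(1 for tail, _ in edges if tail == v) for v in order}

print(len(edges))
for v in sorted(order, key=order.get):
    print(order[v], in_deg[v], out_deg[v])
```

For every vertex the printed in-degree equals `order - 1` and the out-degree equals `n - order`, matching the expressions above.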

Before proving our next theorem, we introduce the following
lemmas.

Lemma 4: In complete acyclic directed graphs, a node that
receives $R$ symbols receives w.h.p. a partial transfer matrix
with rank equal to $\min(R, K)$.

Proof: See the Appendix. ■

Lemma 5: For the complete directed acyclic graph, w.h.p.,

\[ \Delta_S(v) = \frac{K - \min(K, \mathrm{order}(v) - 1)}{K}. \]

Theorem 2: Define the secure max-flow as
the maximum number of symbols that may be secured in
a transmission by using random linear network coding. For
a complete acyclic directed graph with $n$ nodes, the secure
max-flow equals the max-flow min-cut capacity of the network
and is $n - 1$. Conversely, the minimum number of symbols
required for secured transmission is $n - 1$.

Proof:

Let $K = n - 1$ be the max-flow min-cut capacity of the
complete directed acyclic graph. The maximum order of an
intermediate node $v$ is $n - 1$, so its in-degree is
$\delta_I(v) = n - 2$; thus, by Lemma 5, this node has
$\Delta_S(v) = 1/(n - 1) > 0$, and every other intermediate node
has a larger $\Delta_S(v)$. It follows that the
secure max-flow of the complete acyclic directed graph equals
the capacity of the graph.

By contradiction, let the minimum number of required
symbols for secured transmission be $m_s \leq n - 2$. There
exists an intermediate node $v$ such that $\mathrm{order}(v) = n - 1$,
and consequently, with $K = m_s$, $\Delta_S(v) = 0$. Then the minimum
number of required symbols for secure transmission is $m_s = n - 1$.

■ 

It  follows  that  the  way  to  secure  this  class  of  complete 
graphs  is  to  transmit  at  the  max-flow  min-cut  capacity,  if 
necessary  by  adding  “dummy”  symbols. 
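
The effect of transmitting below capacity can be checked numerically. The sketch below evaluates $\Delta_S(v) = (K - \min(K, \delta_I(v)))/K$ with $\delta_I(v) = \mathrm{order}(v) - 1$ for every intermediate node of a complete acyclic directed graph; the function names and the choice $n = 6$ are assumptions for illustration, and the w.h.p. rank statement of Lemma 4 is taken for granted.

```python
from fractions import Fraction


def delta_s(K, order):
    """Delta_S for a node of the given order, using delta_I(v) = order - 1."""
    in_degree = order - 1
    return Fraction(K - min(K, in_degree), K)


def weakest_intermediate(n, K):
    """Smallest Delta_S over the intermediate nodes (orders 2 .. n-1), n >= 3."""
    return min(delta_s(K, order) for order in range(2, n))


n = 6
for K in range(1, n):
    print(K, weakest_intermediate(n, K))
```

Only $K = n - 1$ leaves every intermediate node with a strictly positive $\Delta_S$; for any smaller $K$ the node of order $n - 1$ already collects $K$ symbols.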

V.  Conclusions 

Intrigued  by  the  security  potential  inherent  to  random  linear 
network  coding,  we  developed  a  specific  algebraic  security 
criterion,  for  which  we  proved  a  set  of  key  properties.  Perhaps 
one  of  the  most  striking  conclusions  of  our  analysis  is  that 
algebraic  security  with  network  coding  is  very  dependent  on 
the  topology  of  the  network.  As  an  example,  we  focused  on 
complete  acyclic  directed  graphs,  and  determined  the  secure 
max-flow,  as  well  as  the  minimum  number  of  symbols  required 
for  algebraic  security.  As  part  of  our  ongoing  work,  we  are 
extending  this  analysis  to  other  more  general  network  models. 
Ultimately,  we  would  like  to  develop  secure  communication 
protocols  capable  of  exploiting  random  linear  network  coding 
as an almost free cypher.

Proof of Lemma 2


Contrary to the sum, the product of independent and
uniformly distributed values in $\mathbb{F}_q$ is not independent and
uniformly distributed. In fact, there are two ways to obtain
a zero in a multiplication in $\mathbb{F}_q$: (1) by multiplication between
an element $a \in \mathbb{F}_q$ and 0, and (2) by multiplication over two
elements $a \in \mathbb{F}_q$ and $b \in \mathbb{F}_q$, such that $a \neq 0$ and $b \neq 0$, but
$ab = 0$. Now, the total number of entries of the multiplicative
table between $q$ elements of $\mathbb{F}_q$ is $q^2$, and there are at most $2q$
instances of the first case: $q$ instances of $ab = 0$ with $a = 0$ and
$b \neq 0$, and $q$ instances of $ab = 0$ with $a \neq 0$ and $b = 0$. As for
the second case, it is possible to prove by contradiction that
the number of zeros obtained this way is strictly less than $q^2$:
if this was not the case, all products of elements of $\mathbb{F}_q$ would
be zero, and that is absurd. Since this is true for any $q$, the
number of zeros grows as $O(h(q)) < O(q^2)$. Thus, we have

\[ P(X_{lin} = 0) \leq \frac{2q + h(q)}{q^2}. \]

Since for large enough $q$ we have $(2q + h(q))/q^2 < 1$, it follows
that

\[ P(X_{lin} = 0) \to 0 \quad \text{as } q \to \infty. \]
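
The counting argument can be checked by brute force for small fields. The sketch below builds the multiplication table of $\mathbb{F}_{2^m}$ in polynomial basis and counts the zero entries; the reduction polynomials and function names are assumptions chosen for the example. In these instances every zero product has a zero factor, so the count is $2q - 1$ and the bound of Lemma 2 holds with $h(q) = 0$.

```python
# Assumed irreducible polynomials for GF(2^m), given as bit masks.
IRREDUCIBLE = {2: 0b111, 3: 0b1011, 4: 0b10011, 8: 0x11B}


def gf_mul(a, b, m):
    """Multiply two elements of GF(2^m) in polynomial basis."""
    poly = IRREDUCIBLE[m]
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return result


def zero_products(m):
    """Return (number of zero entries, table size) for the GF(2^m) product table."""
    q = 1 << m
    zeros = sum(1 for a in range(q) for b in range(q) if gf_mul(a, b, m) == 0)
    return zeros, q * q


for m in sorted(IRREDUCIBLE):
    zeros, total = zero_products(m)
    print(m, zeros, total, zeros / total)
```

The printed ratio shrinks roughly like $2/q$, which is the behaviour the lemma relies on as $q \to \infty$.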


Appendix 

Proof of Lemma 1

We will prove this constructively in terms of the ranks of
parts of the transfer matrix. The auxiliary encoding vector in
each intermediate node $v$ is given by

\[ M_{\Gamma_I(v)} = \left( A(I - F)^{-1} \right)_{\Gamma_I(v)}, \]

where $M_{\Gamma_I(v)}$ denotes the columns of the matrix corresponding
to the incoming edges of $v$. The dimension of $M_{\Gamma_I(v)}$ is
$K \times \delta_I(v)$, with $\delta_I(v) \leq |E|$.

To determine the rank of the partial transfer matrix, we note
that the transfer matrix $M = A(I - F)^{-1}B^T$ for the network
must be invertible, and hence $\mathrm{rank}(M) = K$. On the other
hand, to determine the rank of $A(I - F)^{-1}$ we use the fact
that $(I - F)^{-1}$ is invertible and thus $\mathrm{rank}((I - F)^{-1}) = |E|$.
We also have

\[ \mathrm{rank}(A(I - F)^{-1}) \leq |E|, \]

because the dimension of $A(I - F)^{-1}$ is $K \times |E|$. But, since

\[ \mathrm{rank}(A(I - F)^{-1}B^T) = K \leq \min\left(\mathrm{rank}(A(I - F)^{-1}), \mathrm{rank}(B)\right) \]

holds and $K \leq |E|$ (true because $K$ must be less than the
minimum cut in the network) we conclude that

\[ \mathrm{rank}(A(I - F)^{-1}) = K. \]

We now consider $\Delta_S(v)$ at some vertex $v$. For that, we can
consider two distinct cases: the first one is if $K \leq \delta_I(v)$. In
this case, we cannot assume anything about $\Delta_S(v)$, since the
rank of the matrix $M_{\Gamma_I(v)}$ will be dependent on the topology
of the network. As for the second case,
$\mathrm{rank}(M_{\Gamma_I(v)}) < K \Rightarrow \Delta_S(v) > 0$. ■


Proof  of  Lemma  3 

Each position of a line of the transfer matrix $M$ is a linear
combination of independently and uniformly chosen values in
$\mathbb{F}_q$, and thus, the probability of obtaining a zero in a position
is given by Lemma 2. The result follows by considering all the
combinations of the possible positions in which the $Y$ zeros
may occur. ■

Proof  of  Lemma  4 

Suppose that a given intermediate node receives $R > K$
symbols. It is clear that the maximum possible rank is
$K$ and thus there is a way to remove $R - K$ columns s.t. the rank
of the resulting set will still be at maximum $K$. Now consider
the case in which vertex $v$ receives at most $K$ symbols. If the
columns are linearly dependent, the condition

\[ x_{h_1} c_{h_1} + x_{h_2} c_{h_2} + \ldots + x_{h_n} c_{h_n} = (0 \ldots 0)^T, \]

such that $x_{h_1}, x_{h_2}, \ldots, x_{h_n} \in \mathbb{F}_q$ are not all 0 and $h_1, h_2, \ldots, h_n$
represent the columns $\in \Gamma_I(v)$, will be satisfied. Since the
linear combination of lines of the transfer matrix is again a
linear combination of independent and uniformly distributed
values in $\mathbb{F}_q$, it follows from Lemma 3 that the probability of
obtaining $(0 \ldots 0)^T$ tends to 0 when $q \to \infty$ and $K \to \infty$,
and thus, the columns $h_1, h_2, \ldots, h_n \in \Gamma_I(v)$ are linearly
independent w.h.p. ■
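
To give a feel for the "w.h.p." in this lemma, the sketch below evaluates the standard closed form for the probability that $R \leq K$ columns drawn independently and uniformly from $\mathbb{F}_q^K$ are linearly independent, namely $\prod_{i=0}^{R-1} (1 - q^{i-K})$. Treating the columns of the partial transfer matrix as i.i.d. uniform is an idealization made for this example only, and the function name is hypothetical.

```python
from fractions import Fraction


def prob_full_column_rank(q, K, R):
    """P(rank = R) for a K x R matrix with i.i.d. uniform entries in GF(q), R <= K."""
    if R > K:
        raise ValueError("need R <= K")
    prob = Fraction(1)
    for i in range(R):
        # Column i+1 must avoid the span of the previous i columns,
        # which contains q**i of the q**K possible vectors.
        prob *= 1 - Fraction(q ** i, q ** K)
    return prob


for q in (2, 4, 16, 256):
    print(q, float(prob_full_column_rank(q, K=4, R=4)))
```

The probability approaches one as $q$ grows, in line with the limit used in the proof.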

Proof  of  Lemma  5 

It follows from Lemma 4 that, w.h.p., the number of symbols
received by a vertex is the rank of the partial transfer matrix
received (and at most $K$), and thus

\[ \Delta_S(v) = \frac{K - \min(K, \delta_I(v))}{K} = \frac{K - \min(K, \mathrm{order}(v) - 1)}{K}. \]

Abstract: The problem of error control in random network
coding is considered, and a formulation of the problem is given
in terms of rank-metric codes. This formulation allows many of
the tools developed for rank-metric codes to be applied to random
network coding. A random network code induces a generalized
decoding problem for rank-metric codes in which the channel
may supply partial information about the error in the form of
erasures (knowledge of an error location but not its value) and
deviations (knowledge of an error value but not its location).

I.  Introduction 

While random network coding [1] is an effective technique
for information dissemination in communication networks,
it is highly susceptible to errors. The insertion of even a
single corrupt packet has the potential, when linearly combined
with legitimate packets, to affect all packets gathered by an
information receiver. The problem of error control in random
network coding is therefore of great interest.

In this paper, we focus on end-to-end error control coding.
Internal network nodes are assumed to be unaware of the
presence of an outer code; they simply create outgoing packets as
random linear combinations of incoming packets in the usual
manner of random network coding. Unlike some approaches
to error control in network coding (e.g., [2], [3]), we assume
that the source and destination nodes have no knowledge (or
at least make no effort to exploit knowledge) of the topology
of the network or of the particular network code used in the
network.

Two previous "noncoherent" or "channel-blind" approaches
to data transmission in coded networks are closely related to
the work of this paper. Jaggi et al. [4] provide polynomial-time
rate-optimal network codes that combat Byzantine
adversaries. Their approach is based on probabilistic assumptions
that require both the field size and the packet length to be
sufficiently large. In contrast, Koetter and Kschischang [5]
take a more combinatorial approach to the problem. Their key
observation is that a random coded network may be regarded
as a noncoherent multiple-input multiple-output system (over
a finite field). In this context, an appropriate encoding of
information is the choice by the transmitter of a suitable vector
space $V$, rather than a vector as in classical coding theory. The
choice of the space $V$ is signalled by insertion into the network
of a basis for $V$, where each basis vector corresponds to a
transmitted packet. As $V$ is closed under linear combinations
of vectors, random network coding will (in the absence of
noise) preserve the space, even as it mixes the transmitted
vectors themselves. A receiver collects packets, which are
assumed to form a basis for a received space $U$. Correct
reception is possible provided that $U$ and $V$ intersect
in a space of sufficiently large dimension. By defining an
appropriate metric on subspaces, a generalization of classical
coding theory in the Hamming metric becomes possible. This
approach works for any given field and imposes virtually no
constraints on packet size.

This paper is motivated by the results of [5] and addresses
the construction of practical codes. Our main contribution
is to show that, for a large class of codes, the subspace
distance metric of [5] and the rank metric (e.g., [6], [7])
are strongly related, allowing many of the tools from the
theory of rank-metric codes to be naturally applied to random
network coding. We note, however, that our approach is not
a straightforward application of rank-metric codes. Under
random network coding, two phenomena can occur, called
here erasures and deviations, that depart from the conventional
notion of rank errors. Erasures and deviations are dual to each
other and correspond to partial information about the error
matrix, akin to the role played by symbol erasures in the
Hamming metric. In our context, an erasure corresponds to
the knowledge of an error location but not its value, while a
deviation corresponds to the knowledge of an error value but
not its location. These concepts generalize similar concepts
found in the rank-metric literature under the terminology of
"row and column erasures" [8]-[10]. In related work [11], we
provide an efficient decoding algorithm for Gabidulin codes
[7] that takes into account erasures and deviations. However,
space limitations make it impossible to describe these results
here.

The remainder of this paper is organized as follows. In
Section II, we provide a brief description of the rank metric
and its properties. In Section III, we provide a formulation
of the problem of error control in random network coding.
Finally, in Section IV, we demonstrate the strong connection
between certain decoding problems in the rank metric and the
subspace decoding approach of [5]. Proofs have been omitted
throughout; however, see [11].


II.  Preliminaries 


Let $q \geq 2$ be a power of a prime. In this paper, all
vectors and matrices are defined over the finite field $\mathbb{F}_q$, unless
otherwise mentioned. We denote $\mathbb{F}_q^{n \times m}$ as the set of all $n \times m$
matrices over $\mathbb{F}_q$ and we set $\mathbb{F}_q^n = \mathbb{F}_q^{n \times 1}$. Thus, $v \in \mathbb{F}_q^n$ is a
column vector, whereas $v \in \mathbb{F}_q^{1 \times m}$ is a row vector.

If $v$ is a vector, then the symbol $v_i$ denotes the $i$th entry of $v$.
For matrices, the notation varies according to how the matrix
is defined. $A_1, \ldots, A_k$ could be either the rows of a matrix

\[ A = \begin{bmatrix} A_1 \\ \vdots \\ A_k \end{bmatrix} \]

or the columns of a matrix $A = [A_1, \ldots, A_k]$.
The distinction will always be clear from context. In either
case, the symbol $A_{ij}$ always refers to the entry in the $i$th row
and $j$th column of $A$.

The $k \times k$ identity matrix is denoted by $I_{k \times k}$. If $I = I_{k \times k}$,
then the notation $I_i$ refers to the $i$th column of $I$.

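We will make extensive use of the column vector variables
$L_1, L_2, \ldots \in \mathbb{F}_q^n$ and the row vector variables
$V_1, V_2, \ldots \in \mathbb{F}_q^{1 \times m}$. For these variables only, we introduce the notations

\[ L_a^b = [L_a, \ldots, L_b] \quad \text{and} \quad V_a^b = \begin{bmatrix} V_a \\ \vdots \\ V_b \end{bmatrix}. \]

Note that the product $L_a^b V_a^b$ can be expanded as

\[ L_a^b V_a^b = \sum_{k=a}^{b} L_k V_k, \tag{1} \]

since $(L_k V_k)_{ij} = L_{ik} V_{kj}$.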

If $S \subseteq \{1, \ldots, k\}$ and $A \in \mathbb{F}_q^{n \times k}$, then $A_S = [A_i,\ i \in S]$
will refer to the matrix consisting of the columns of $A$ that
are indexed by $S$, placed in natural (increasing) order.

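Let $I = I_{n \times n}$, $U \subseteq \{1, \ldots, n\}$, and let $U^c$ denote the complement of $U$ in $\{1, \ldots, n\}$.
The matrices $I_U$ and $I_{U^c}$ will be extensively used in Section IV
to simplify notation. For any $A \in \mathbb{F}_q^{n \times k}$ (respectively, $A \in
\mathbb{F}_q^{k \times n}$), the matrix $I_U^T A$ (resp., $A I_U$) extracts the rows (resp.,
columns) of $A$ indexed by $U$. Conversely, for any $B \in \mathbb{F}_q^{|U| \times k}$
(resp., $B \in \mathbb{F}_q^{k \times |U|}$) the matrix $I_U B$ (resp., $B I_U^T$) reallocates
the rows (resp., columns) of $B$ to the positions indexed by
$U$, where all-zero rows (resp., columns) are inserted at the
positions indexed by $U^c$. Finally, observe that $I_U$ and $I_{U^c}$
satisfy the following properties:

\[ I = I_U I_U^T + I_{U^c} I_{U^c}^T, \tag{2} \]

\[ I_U^T I_U = I_{|U| \times |U|}, \tag{3} \]

\[ I_U^T I_{U^c} = 0. \tag{4} \]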

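Let $X \in \mathbb{F}_q^{n \times m}$. We use $\langle X \rangle$ and $\mathrm{rank}\, X$ to denote,
respectively, the row space and the rank of $X$. By definition,
$\mathrm{rank}\, X = \dim \langle X \rangle$. An equivalent definition of rank is the
following:

\[ \mathrm{rank}\, X = \min_{r,\, L_1^r,\, V_1^r \,:\, X = L_1^r V_1^r} r. \tag{5} \]

It is well-known that, for any $X, Y \in \mathbb{F}_q^{n \times m}$, we have

\[ \mathrm{rank}(X + Y) \leq \mathrm{rank}\, X + \mathrm{rank}\, Y. \tag{6} \]

Proposition 1: For any $X \in \mathbb{F}_q^{n \times m}$ and $Y \in \mathbb{F}_q^{N \times m}$,

\begin{align}
\mathrm{rank} \begin{bmatrix} X \\ Y \end{bmatrix}
&= \mathrm{rank}\, X + \mathrm{rank}\, Y - \dim\left(\langle X \rangle \cap \langle Y \rangle\right) \tag{7} \\
&= \mathrm{rank}\, X + \min_{A \in \mathbb{F}_q^{N \times n}} \mathrm{rank}(Y - AX). \tag{8}
\end{align}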


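Proposition 2: For any $X \in \mathbb{F}_q^{n \times m}$, $Y \in \mathbb{F}_q^{N \times m}$ and $r \leq n$,

\[ \min_{A \,:\, \mathrm{rank}\, A \geq r} \mathrm{rank}(Y - AX) = \mathrm{rank} \begin{bmatrix} Y \\ X \end{bmatrix} - \mathrm{rank}\, X + c_r(X, Y), \tag{9} \]

where $c_r(X, Y) = \max\{r - n + \mathrm{rank}\, X - \mathrm{rank}\, Y,\ 0\}$.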

A matrix code $\mathcal{C}$ is defined as any subset of $\mathbb{F}_q^{n \times m}$. A
matrix code is also commonly known as an array code when
it forms a linear space over $\mathbb{F}_q$ [12]. It follows from (6) that
the following distance function is a metric [7]:

Definition 1: The rank distance between matrices $a, b \in
\mathbb{F}_q^{n \times m}$ is defined as $d_R(a, b) = \mathrm{rank}(b - a)$.

A rank-metric code is any matrix code used in the context
of the rank distance metric. The minimum distance of a
rank-metric code is the minimum rank distance of the code among
all pairs of codewords.
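
As a small illustration of these definitions, the sketch below computes the rank distance and the minimum distance of a toy matrix code. It assumes a prime field $\mathbb{F}_p$ so that inverses can be taken with modular exponentiation; the function names and the four-codeword example over $\mathbb{F}_2$ are assumptions made here, not taken from the paper.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], p - 2, p)   # Fermat inverse, p prime
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(rank + 1, len(m)):
            if m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def rank_distance(a, b, p):
    """d_R(a, b) = rank(b - a) over GF(p)."""
    diff = [[(y - x) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return rank_mod_p(diff, p)


def minimum_distance(code, p):
    """Minimum rank distance over all pairs of distinct codewords."""
    return min(rank_distance(a, b, p) for a, b in combinations(code, 2))


# Toy code over GF(2): four 2 x 2 matrices (an illustrative choice).
code = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 1]],
    [[0, 1], [1, 1]],
    [[1, 1], [1, 0]],
]
print(minimum_distance(code, 2))
```

In this toy code every pairwise difference is itself an invertible matrix, so the minimum distance equals the full rank of the codewords.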

Consider a rank-metric code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ and let $c$ be a
codeword in $\mathcal{C}$. Suppose an error matrix $e \in \mathbb{F}_q^{n \times m}$ is added
to $c$ and the matrix $r = c + e$ is received. The standard rank
decoding problem is to find a codeword $\hat{c} \in \mathcal{C}$ that minimizes
the rank distance between $r$ and $\hat{c}$, that is,

\[ \hat{c} = \arg\min_{c \in \mathcal{C}} \mathrm{rank}(r - c). \tag{10} \]

Now, if the error matrix $e = r - c$ has rank $\tau$, we can use
(5) and (1) to write

\[ e = L_1^\tau V_1^\tau = \sum_{j=1}^{\tau} L_j V_j, \tag{11} \]

where the vectors $L_1, \ldots, L_\tau$ and $V_1, \ldots, V_\tau$ are as defined
above.

Vectors $V_j$ and $L_j$ can be thought of as giving the value and
the location, respectively, of the $j$th error component. Namely,
the error value $V_j$ appears (multiplied by the coefficient $L_{ij}$)
in every row $i$ of $e$ for which $L_{ij}$ is nonzero.

Note that if $m = 1$ and $L_j$ is a unit vector (that is, a
zero/one vector with a single one) for all $j$, then this
rank-metric description of the error naturally reduces to that of
the Hamming metric. Thus, rank decoding can be seen as
a generalization of minimum Hamming distance decoding.
Unlike the classical case, however, the description of the error
matrix in terms of general $L_j$ and $V_j$ is not unique. Namely,
if $e = L_1^\tau V_1^\tau$, then we also have $e = (L_1^\tau T)(T^{-1} V_1^\tau)$ for any
nonsingular matrix $T \in \mathbb{F}_q^{\tau \times \tau}$.
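
The non-uniqueness of the location/value description can be verified directly on a small case. The sketch below works over $\mathbb{F}_5$ with a rank-2 error matrix; the field, the vectors and the matrix $T$ are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
P = 5  # assumed small prime field GF(5)


def matmul_mod(A, B, p=P):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]


def outer_sum(L_cols, V_rows, p=P):
    """sum_j L_j V_j for column vectors L_j and row vectors V_j (as lists)."""
    n, m = len(L_cols[0]), len(V_rows[0])
    e = [[0] * m for _ in range(n)]
    for Lj, Vj in zip(L_cols, V_rows):
        for i in range(n):
            for k in range(m):
                e[i][k] = (e[i][k] + Lj[i] * Vj[k]) % p
    return e


L_cols = [[1, 0, 2], [0, 1, 3]]          # locations L_1, L_2 in F_5^3
V_rows = [[1, 2, 0, 4], [3, 0, 1, 1]]    # values V_1, V_2 in F_5^{1 x 4}

L = [list(r) for r in zip(*L_cols)]      # 3 x 2 matrix [L_1, L_2]
V = V_rows                               # 2 x 4 matrix with rows V_1, V_2
e = matmul_mod(L, V)

T = [[1, 1], [0, 1]]                     # nonsingular over GF(5)
T_inv = [[1, 4], [0, 1]]                 # its inverse modulo 5

e_alt = matmul_mod(matmul_mod(L, T), matmul_mod(T_inv, V))
print(e)
print(e_alt)
```

Both factorizations give the same error matrix, although the "location" factor $L_1^\tau T$ differs from $L_1^\tau$.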
III. Error Correction in Random Network Coding

We consider a point-to-point communication network with
a single source node and a single destination node. The source
node selects a message $w$ from a set $W$ and encodes this
message into $n$ packets $X_1, \ldots, X_n \in \mathbb{F}_q^{1 \times M}$, which are regarded
as the incoming packets to the source node. Each node in the
network, including the source, performs standard distributed
network coding [1]: whenever a node has a transmission
opportunity, it produces an outgoing packet as a random
$\mathbb{F}_q$-linear combination of all the incoming packets it has
until then received. The destination node collects $N$ packets
$Y_1, \ldots, Y_N \in \mathbb{F}_q^{1 \times M}$ and decodes these packets into an estimate
$\hat{w} \in W$ of the original message. The decoding is successful
whenever $\hat{w} = w$.

Let $X$ be an $n \times M$ matrix whose rows are the transmitted
packets $X_1, \ldots, X_n$ and, similarly, let $Y$ be an $N \times M$ matrix
whose rows are the received packets $Y_1, \ldots, Y_N$. Since all
packet operations are linear over $\mathbb{F}_q$, then, regardless of the
network topology, the transmitted and received packets can be
related by the following expression:

\[ Y = AX, \tag{12} \]

where $A$ is an $N \times n$ matrix corresponding to the overall linear
transformation applied by the network (note that any linear
packet operations performed at the source node are considered
part of the network).

Before proceeding, we note that this model encompasses a
variety of situations.

•  The network may have cycles or delays. Since the overall
system is linear, expression (12) will be true regardless
of the network topology.

•  The network could be wireless instead of wired. In this
case, we simply constrain each intermediate node to send
exactly the same packet in each of its outgoing links.

•  The source node may want to transmit more than one
message. In this case, we assume that each packet carries
an index of the message to which it corresponds and
that packets with different message indices are processed
separately throughout the network [13].

•  The network topology may exhibit time-variance as nodes
join and leave and connections are established and lost. In
this case, the model can still be preserved by considering
each link to be the instantiation of a successful packet
transmission.

•  The network may operate in multicast mode, i.e., there
may be more than one destination node. Again,
expression (12) still applies, where the matrix $A$ may be
different for each destination.

Let us now extend this model to incorporate packet errors.
Suppose that, at the input of each link, a corrupting packet
may be added to the packet being transmitted at that link. Let
$Z_i \in \mathbb{F}_q^{1 \times M}$ denote the corrupting packet applied at the input of
link $i$, where we assume that the links are indexed from 1 to
$L$. By linearity of the network, we can write

\[ Y = AX + BZ, \tag{13} \]

where $Z$ is an $L \times M$ matrix whose rows are $Z_1, \ldots, Z_L$ and
$B$ is an $N \times L$ matrix corresponding to the overall linear
transformation incurred by the corrupting packets until the
destination.

Observe that this model can represent not only the occurrence
of random link errors but also the action of malicious
nodes, in the following way. We assume that each node,
malicious or not, creates a prescribed outgoing packet as a
random linear combination of incoming packets, as described
above. A non-malicious node then simply transmits this prescribed
packet as its outgoing packet. A malicious node may
either operate as a non-malicious node or may replace this
prescribed packet by one of the following: an arbitrary linear
combination of the incoming packets or an arbitrary packet
which may not be a linear combination of the incoming
packets. Refusing to transmit a packet corresponds to sending
the trivial linear combination, i.e., an all-zero packet. Note
that all these operations can be represented in the model as
the addition of a corrupting packet to the prescribed outgoing
packet, thus (13) holds. The number of nonzero rows of $Z$
gives exactly the number of “packet interventions” performed
by a malicious node and thus gives a sense of the “power”
employed by this node towards jamming the network.
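
As a rough illustration of model (13), the following sketch evaluates $Y = AX + BZ$ over a small prime field and counts the nonzero rows of $Z$. The field size (5), the matrix entries, and the helper names `matmul`, `matadd` and `wt` are assumptions made for this example only; the text does not fix any of them.

```python
# Toy instance of Y = AX + BZ over GF(p); every value here is illustrative.
P = 5  # assumed prime field size


def matmul(a, b, p=P):
    """Matrix product over GF(p) for list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def matadd(a, b, p=P):
    """Entrywise sum over GF(p)."""
    return [[(x + y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def wt(z):
    """Number of nonzero rows of Z (the 'packet interventions')."""
    return sum(1 for row in z if any(v != 0 for v in row))


X = [[1, 0, 2], [0, 1, 3]]              # n = 2 transmitted packets, M = 3
A = [[1, 1], [2, 1]]                    # N = 2 received packets
B = [[1, 0, 0], [0, 0, 1]]              # L = 3 links
Z = [[0, 0, 0], [0, 0, 0], [4, 0, 1]]   # a single corrupting packet on link 3

Y = matadd(matmul(A, X), matmul(B, Z))
print(Y, wt(Z))
```

Only the third row of `Z` is nonzero here, so this adversary has made one packet intervention, even though it disturbs two coordinates of the second received packet.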

Let $\mathrm{Enc}\colon \mathcal{W} \to \mathbb{F}_q^{n \times M}$ be the encoding function applied
by the source node. Consider the decoding operation at the
destination node. Performing maximum-likelihood decoding
of the message $w$ given the received matrix $Y$ would require
knowledge of the statistics of $A$, $B$ and $Z$, which may not be
available or may be too hard to find. A more modest goal may
be to find a message that minimizes the number of corrupting
packets introduced in the network, owing to the assumption
that adversarial power is somehow limited or costly.

If $A$ and $B$ are known to the destination node, then we
may define the network decoding problem as the problem of
finding

$$\hat{w} = \operatorname*{argmin}_{w \in \mathcal{W}} \; \min_{\substack{Z \,:\, Y = AX + BZ \\ X = \mathrm{Enc}(w)}} \mathrm{wt}(Z) \tag{14}$$

where $\mathrm{wt}(Z)$ denotes the number of nonzero rows of $Z$. A
typical situation where this problem arises is when both the
network code and the network topology are deterministic and
known to the destination node [2], [3].

Here, we are interested in encoding and decoding functions
that operate under no assumptions on the matrices $A$ and
$B$ (except possibly some constraint on the rank of $A$). Any
valid explanation of the received $Y$ in terms of $X$, $A$, $B$
and $Z$ is conceivable, and the decoder attempts to find one
that minimizes $\mathrm{wt}(Z)$. Thus, we define the random network
decoding problem as the problem of finding

$$\hat{w} = \operatorname*{argmin}_{w \in \mathcal{W}} \; \min_{\substack{A,\, B,\, Z \,:\, Y = AX + BZ \\ X = \mathrm{Enc}(w) \\ \operatorname{rank} A \ge r}} \mathrm{wt}(Z) \tag{15}$$

where $r \le N$ is some lower bound on the rank of $A$.

The ability to range over all possible $A$ and $B$ actually
facilitates the problem, since we can now find the value of the
inner minimization in (15).

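Theorem 3: Let $X \in \mathbb{F}_q^{n \times M}$, $Y \in \mathbb{F}_q^{N \times M}$, $Z \in \mathbb{F}_q^{L \times M}$,
$A \in \mathbb{F}_q^{N \times n}$ and $B \in \mathbb{F}_q^{N \times L}$, where $n$, $N$, $M$ and $L \ge N$
are positive integers. Let $\mathcal{W}$ be a finite set and $\mathrm{Enc}\colon \mathcal{W} \to \mathbb{F}_q^{n \times M}$.
The random network decoding problem (15) can be
reformulated as

$$\hat{w} = \operatorname*{argmin}_{\substack{w \in \mathcal{W} \\ X = \mathrm{Enc}(w)}} \; \operatorname{rank} \begin{bmatrix} Y \\ X \end{bmatrix} - \operatorname{rank} X + c_r(X, Y) \tag{16}$$

where $c_r(X, Y) = \max\{r - n + \operatorname{rank} X - \operatorname{rank} Y,\ 0\}$.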
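
The objective in (16) can be evaluated for a single candidate $X$ with nothing more than a rank routine over the field. The sketch below does this over an assumed prime field GF(5); the function names `rank_mod_p` and `objective` and the numerical example are hypothetical and are not taken from the paper.

```python
def rank_mod_p(rows, p):
    """Rank of a list-of-lists matrix over GF(p), p prime (Gauss-Jordan)."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], p - 2, p)      # inverse via Fermat's little theorem
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def objective(X, Y, r, p):
    """rank [Y; X] - rank X + c_r(X, Y), the quantity minimized in (16)."""
    n = len(X)
    rx, ry = rank_mod_p(X, p), rank_mod_p(Y, p)
    c_r = max(r - n + rx - ry, 0)
    return rank_mod_p(Y + X, p) - rx + c_r     # Y + X stacks the rows


X = [[1, 0, 2], [0, 1, 3]]      # candidate transmitted matrix (n = 2, M = 3)
Y = [[1, 1, 0], [1, 1, 3]]      # received matrix (N = 2)
print(objective(X, Y, r=2, p=5))
```

In this toy instance the first row of `Y` lies in the row space of `X` and the second does not, so the objective counts one corrupting packet.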

From Theorem 3, we observe that performing any elementary
row operations on $X$ or $Y$ does not change the decoding
problem. Thus, from the point of view of the decoder, what is
transmitted is not the actual matrix $X$, but only the row space
of $X$. Likewise, it is the row space of $Y$ that is effectively
received by the destination node. This observation provides
a close connection between the random network decoding
problem proposed in this paper and the coding theory for
subspaces introduced in [5].

Observe also that, if $\operatorname{rank} X$ is constant for all valid $X$,
then the term $c_r(X, Y)$ does not depend on the transmitted
message and can be omitted from the minimization.

From now on, we can assume that $\operatorname{rank} Y = N$, since only a
basis for $\langle Y \rangle$ needs to be considered as the received matrix. In
practice, this means that any linearly dependent received packet
may be safely discarded by the destination node.

The approach we propose in this paper is characterized by
choosing the message set as a rank-metric code $\mathcal{W} = \mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$
and setting $\mathrm{Enc}(x) = [I \;\; x]$ for all $x \in \mathcal{C}$, where we
assume $M = n + m > n$.


Note that, in the error-free case, we may choose $\mathcal{C} = \mathbb{F}_q^{n \times m}$
and obtain exactly the standard random network coding approach
[1], [13]. The vectors $x_1, \ldots, x_n$ may be interpreted as
data packets, and each of the transmitted packets $X_1, \ldots, X_n$
is formed by appending a header at the beginning of the
corresponding data packet, so that $X = [I \;\; x]$. The received
matrix will then be given by $Y = [A \;\; Ax]$, from which $x$
can be recovered if $A$ has rank $n$.
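
A minimal sketch of this error-free recovery, assuming a prime field GF(5) and hypothetical helper names: row-reducing $Y = [A \;\; Ax]$ turns the header block into the identity and leaves $x$ in the payload columns.

```python
def rref_mod_p(rows, p):
    """Reduced row echelon form over GF(p), p prime."""
    m = [[v % p for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        inv = pow(m[r][col], p - 2, p)
        m[r] = [(v * inv) % p for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        r += 1
    return m


def decode_error_free(Y, n, p):
    """Recover x from Y = [A  Ax]; raises if the header does not reduce to I."""
    R = rref_mod_p(Y, p)
    if len(R) < n or any(R[i][:n] != [int(i == j) for j in range(n)]
                         for i in range(n)):
        raise ValueError("A does not have rank n")
    return [row[n:] for row in R[:n]]


# A = [[1, 1], [2, 1]] and x = [[2], [3]] over GF(5) give Ax = [[0], [2]].
Y = [[1, 1, 0], [2, 1, 2]]
print(decode_error_free(Y, n=2, p=5))
```

The decoder never needs to know $A$ separately, since the header columns of `Y` carry it.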

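Proposition 4: Let $\mathcal{W} = \mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ and $\mathrm{Enc}(x) = [I \;\; x]$
for all $x \in \mathcal{C}$. If $Y = [A \;\; y]$, then the random network
decoding problem (16) becomes the problem of finding

$$\hat{x} = \operatorname*{argmin}_{x \in \mathcal{C}} \operatorname{rank}(y - Ax). \tag{17}$$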


Note that if $A$ is square and invertible, then $\operatorname{rank}(y - Ax) =
\operatorname{rank}(A^{-1}y - x) = \operatorname{rank}(r - x)$, where $r = A^{-1}y$. In this
case, we obtain the conventional rank decoding problem of
finding a codeword $x \in \mathcal{C}$ that is closest in rank distance to
$r$. Thus, at least in this case, it is clear that the rank distance
is the “right” metric for this problem, and that we should use
a rank-metric code with large minimum rank distance.
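
For very small codes, problem (17) can be solved by exhaustive search, which makes the role of the rank distance easy to see. The sketch below assumes GF(2) and a two-codeword code $\{0, I_3\}$ of minimum rank distance 3; the code, the matrices and the function names are illustrative assumptions, and a practical decoder would exploit the code structure instead of enumerating codewords.

```python
def rank_mod_p(rows, p):
    """Rank of a list-of-lists matrix over GF(p), p prime."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], p - 2, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def matmul(a, b, p):
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def matsub(a, b, p):
    return [[(x - y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rank_decode(code, A, y, p):
    """Brute-force argmin over the code of rank(y - A x); returns (x, rank)."""
    cost = lambda x: rank_mod_p(matsub(y, matmul(A, x, p), p), p)
    best = min(code, key=cost)
    return best, cost(best)


I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0] for _ in range(3)]
A = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]      # invertible over GF(2)
y = [[0, 1, 1], [0, 1, 0], [0, 0, 1]]      # A*I3 plus a rank-1 error in row 1
x_hat, dist = rank_decode([ZERO, I3], A, y, p=2)
print(x_hat, dist)
```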

In the case where $A$ is not invertible, a general approach
could be to define a new code $\mathcal{C}' = A\mathcal{C} = \{Ax,\ x \in \mathcal{C}\}$ and
find a codeword $x' \in \mathcal{C}'$ that has the smallest rank distance
to $y$. Then, any pre-image of $x'$ in $\mathcal{C}$ will be a solution to the
random network decoding problem. This approach, however,
has the inconvenience that a new code would have to be
used at each decoding instance, which may lead to decoding
inefficiency (an efficient decoder for the code $\mathcal{C}'$ may not even
be known). Instead, we would like to find a decoding problem
where the structure of $\mathcal{C}$ could still be exploited. This is the
subject of the next section.


IV.  A  Generalized  Rank  Decoding  Problem 

In this section we explore the case when $A$ is not square
and invertible.

Define $\mu$ and $\delta$ such that $\operatorname{rank} A = n - \mu$ and $N = n - \mu + \delta$.
Choose an $N \times N$ nonsingular matrix $T = \begin{bmatrix} T_1 \\ T_2 \end{bmatrix}$, where
$T_1$ and $T_2$ have $n - \mu$ and $\delta$ rows, respectively, such that
$\operatorname{rank} T_1 A = n - \mu$ and $T_2 A = 0$. Such a matrix $T$ can be
found by performing Gaussian elimination on $A$. Note that,
since $\operatorname{rank} TY = N$, we must have $\operatorname{rank} T_2 y = \delta$.
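
One way to obtain such a $T$ is to row-reduce the augmented matrix $[A \;\; I_N]$ and read the accumulated row operations off the right-hand block. The sketch below does this over an assumed field GF(2); the function name `split_transform` and the example matrix are hypothetical.

```python
def matmul(a, b, p):
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def split_transform(A, p):
    """Return (T1, T2) with T = [T1; T2] nonsingular, T1*A of full row rank
    and T2*A = 0, found by Gauss-Jordan elimination on [A | I_N] over GF(p)."""
    N, n = len(A), len(A[0])
    aug = [[v % p for v in row] + [int(i == j) for j in range(N)]
           for i, row in enumerate(A)]
    rank = 0
    for col in range(n):
        piv = next((i for i in range(rank, N) if aug[i][col]), None)
        if piv is None:
            continue
        aug[rank], aug[piv] = aug[piv], aug[rank]
        inv = pow(aug[rank][col], p - 2, p)
        aug[rank] = [(v * inv) % p for v in aug[rank]]
        for i in range(N):
            if i != rank and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(a - f * b) % p for a, b in zip(aug[i], aug[rank])]
        rank += 1
    T = [row[n:] for row in aug]    # right block holds the row operations
    return T[:rank], T[rank:]


A = [[1, 1], [1, 1], [0, 0]]        # N = 3, n = 2, rank 1, so mu = 1, delta = 2
T1, T2 = split_transform(A, p=2)
print(T1, T2, matmul(T2, A, 2))
```

Here `T1` has $n - \mu = 1$ row and `T2` has $\delta = 2$ rows, and the last printed matrix is all zeros.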

We can now rewrite our objective function as

$$\operatorname{rank}(y - Ax) = \operatorname{rank} T(y - Ax) = \operatorname{rank} \begin{bmatrix} T_1 y - T_1 A x \\ V \end{bmatrix} \tag{18}$$

where $V = T_2 y$.

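Let us first examine the case where $\mu = 0$, i.e., $\operatorname{rank} A = n$.
We can choose $T_1$ such that $T_1 A = I$ and obtain

$$\operatorname{rank} \begin{bmatrix} T_1 y - T_1 A x \\ V \end{bmatrix} = \operatorname{rank} \begin{bmatrix} r - x \\ V \end{bmatrix} \tag{19}$$

where $r = T_1 y$.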

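From (8), we obtain

$$\operatorname{rank} \begin{bmatrix} r - x \\ V \end{bmatrix} = \delta + \min_{L \in \mathbb{F}_q^{n \times \delta}} \operatorname{rank}(r - x - LV). \tag{20}$$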


Thus, we obtain a problem very similar to minimizing the
rank distance between $r$ and $x$, except for the presence of
the term $LV$. Roughly speaking, this means that any rank
difference that could be “explained away” by $V$ is not counted
in the rank distance.
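
Identity (20) can be checked by brute force on a tiny instance. The sketch below writes $e$ for $r - x$, assumes GF(2) and a $V$ of full row rank $\delta$, and enumerates every $n \times \delta$ matrix $L$; the helper names and the example matrices are assumptions for illustration.

```python
from itertools import product


def rank_mod_p(rows, p):
    """Rank of a list-of-lists matrix over GF(p), p prime."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[rank], m[piv] = m[piv], m[rank]
        inv = pow(m[rank][col], p - 2, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def matmul(a, b, p):
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def matsub(a, b, p):
    return [[(x - y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def both_sides(e, V, p):
    """Return (rank [e; V], delta + min over L of rank(e - L V))."""
    n, delta = len(e), len(V)
    best = None
    for flat in product(range(p), repeat=n * delta):
        L = [list(flat[i * delta:(i + 1) * delta]) for i in range(n)]
        rk = rank_mod_p(matsub(e, matmul(L, V, p), p), p)
        best = rk if best is None else min(best, rk)
    return rank_mod_p(e + V, p), delta + best


e = [[1, 1, 0], [0, 1, 1]]   # plays the role of r - x (n = 2, m = 3)
V = [[1, 1, 0]]              # delta = 1, full row rank
print(both_sides(e, V, p=2))
```

The first row of `e` equals `V` and is explained away, so only the second row contributes to the remaining rank.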

Let us now proceed to the general case. Similarly as above,
we would like to choose $T_1$ in such a way that the resulting
decoding problem is as close as possible to the standard
decoding problem in $\mathcal{C}$. Since, for $\mu > 0$, we cannot make
$T_1 A = I$, we will choose $T_1$ so that $T_1 A$ is as close as possible
to an identity matrix. Let $\mathcal{U}^c$ be the set of indices of some $n - \mu$
linearly independent columns of $A$, and let $\mathcal{U} = \{1, \ldots, n\} \setminus \mathcal{U}^c$.
We choose $T_1$ such that the columns of $T_1 A$ indexed by $\mathcal{U}^c$
form an identity submatrix, i.e., $T_1 A I_{\mathcal{U}^c} = I_{(n-\mu) \times (n-\mu)}$. The
remaining columns of $T_1 A$ are given in $T_1 A I_{\mathcal{U}}$. We record this
information in the matrix

$$L = I_{\mathcal{U}} - I_{\mathcal{U}^c} T_1 A I_{\mathcal{U}} \tag{21}$$

which possesses the property $I_{\mathcal{U}}^T L = I_{\mu \times \mu}$ (in particular,
$\operatorname{rank} L = \mu$). Using properties of the matrices $I_{\mathcal{U}}$ and $I_{\mathcal{U}^c}$,
it is easy to verify by direct substitution that $T_1 A$ can be
recovered from $L$ as

$$T_1 A = I_{\mathcal{U}^c}^T (I - L I_{\mathcal{U}}^T). \tag{22}$$
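
The relations (21) and (22) can be checked numerically on a small example. The sketch below assumes that $I_{\mathcal{U}}$ denotes the $n \times |\mathcal{U}|$ matrix formed by the columns of the identity indexed by $\mathcal{U}$, uses 0-based indices, and works over an assumed field GF(5); the value chosen for $T_1 A$ and all helper names are hypothetical.

```python
P = 5  # assumed prime field size


def matmul(a, b, p=P):
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def matsub(a, b, p=P):
    return [[(x - y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def selector(n, idx):
    """n x len(idx) matrix whose columns are the columns of I_n listed in idx."""
    return [[int(i == j) for j in idx] for i in range(n)]


n = 3
Uc, U = [0, 2], [1]                  # 0-based index sets, mu = 1
T1A = [[1, 2, 0], [0, 3, 1]]         # identity on the columns in Uc
I_n, I_U, I_Uc = selector(n, range(n)), selector(n, U), selector(n, Uc)

# Equation (21): L = I_U - I_Uc * T1A * I_U
L = matsub(I_U, matmul(I_Uc, matmul(T1A, I_U)))

# Equation (22): T1A = I_Uc^T * (I - L * I_U^T)
recovered = matmul(transpose(I_Uc), matsub(I_n, matmul(L, transpose(I_U))))

print(L, recovered, matmul(transpose(I_U), L))
```

The last printed value is $I_{\mathcal{U}}^T L$, which is the $1 \times 1$ identity here, consistent with $\operatorname{rank} L = \mu$.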

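Define $r = I_{\mathcal{U}^c} T_1 y$, so that $I_{\mathcal{U}^c}^T r = T_1 y$ and $I_{\mathcal{U}}^T r = 0$. It
follows that $I_{\mathcal{U}^c}^T (I - L I_{\mathcal{U}}^T) r = T_1 y$. We can now rewrite (18)
as

\begin{align}
\operatorname{rank} \begin{bmatrix} T_1 y - T_1 A x \\ V \end{bmatrix}
&= \operatorname{rank} \begin{bmatrix} I_{\mathcal{U}^c}^T (I - L I_{\mathcal{U}}^T)(r - x) \\ V \end{bmatrix} \tag{23} \\
&= \operatorname{rank} \begin{bmatrix} (I - L I_{\mathcal{U}}^T)(r - x) \\ V \end{bmatrix} \tag{24}
\end{align}

where the last equality follows from $I_{\mathcal{U}}^T (I - L I_{\mathcal{U}}^T) = 0$.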

We have obtained an expression that is very similar to a rank
distance between $r$ and $x$. The precise relationship between
(24) and the rank of $r - x$ will be established in the following
definition and the subsequent lemma.

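Definition 2: Let $e \in \mathbb{F}_q^{n \times m}$, $L \in \mathbb{F}_q^{n \times \mu}$ and $V \in \mathbb{F}_q^{\delta \times m}$.
The rank of $e$ given $L$ and $V$, denoted by $\operatorname{rank}(e \mid L, V)$, is
defined as

$$\operatorname{rank}(e \mid L, V) \triangleq \min_{\substack{\tau,\, L_1^{\tau},\, V_1^{\tau} \,: \\ e = L_1^{\tau} V_1^{\tau} \\ L_1^{\mu} = L,\ V_{\mu+1}^{\mu+\delta} = V}} \tau. \tag{25}$$

If $(\tau, L_1^{\tau}, V_1^{\tau})$ is a solution to the above problem, then the
decomposition $e = L_1^{\tau} V_1^{\tau}$ will be called a rank decomposition
of $e$ consistent with $L$ and $V$.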

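Lemma 5: Let $\mathcal{U} \subseteq \{1, \ldots, n\}$, $L \in \mathbb{F}_q^{n \times \mu}$ and $V \in \mathbb{F}_q^{\delta \times m}$
be such that $|\mathcal{U}| = \mu$, $I_{\mathcal{U}}^T L = I_{\mu \times \mu}$ and $\operatorname{rank} V = \delta$. Then,
for any $e \in \mathbb{F}_q^{n \times m}$,

$$\operatorname{rank} \begin{bmatrix} (I - L I_{\mathcal{U}}^T) e \\ V \end{bmatrix} = -\mu + \operatorname{rank}(e \mid L, V). \tag{26}$$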


From Lemma 5, we see that the random network decoding
problem (17) can be restated as

$$\hat{x} = \operatorname*{argmin}_{x \in \mathcal{C}} \operatorname{rank}(r - x \mid L, V). \tag{27}$$

If the error matrix $e = r - x$ can be rank-decomposed
into $e = L_1^{\tau} V_1^{\tau}$ consistently with $L$ and $V$, then we will say
that $\mu$ erasures, $\delta$ deviations and $\epsilon = \tau - \mu - \delta$ errors have
occurred. The components $L_j V_j$, $j = 1, \ldots, \mu$, correspond
to the erasures, the components $L_j V_j$, $j = \mu + 1, \ldots, \mu + \delta$,
correspond to the deviations, while the remaining components
$L_j V_j$, $j = \mu + \delta + 1, \ldots, \tau$, correspond to the (unknown)
errors.

Proposition 6: A rank-metric code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ of minimum
rank distance $d$ is able to correct any pattern of $\epsilon$ errors, $\mu$
erasures and $\delta$ deviations if and only if $2\epsilon + \mu + \delta \le d - 1$.
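
The condition in Proposition 6 reduces to a one-line test. In the sketch below, the function name and the sample values of $d$, $\epsilon$, $\mu$ and $\delta$ are illustrative assumptions.

```python
def is_correctable(errors, erasures, deviations, d):
    """Proposition 6 condition: 2*eps + mu + delta <= d - 1."""
    return 2 * errors + erasures + deviations <= d - 1


# With minimum rank distance d = 5, each error costs twice as much
# of the budget d - 1 = 4 as an erasure or a deviation.
for pattern in [(2, 0, 0), (1, 1, 1), (0, 2, 2), (2, 1, 0), (0, 3, 2)]:
    print(pattern, is_correctable(*pattern, d=5))
```

The trade-off is the familiar one from classical errors-and-erasures decoding: side information about the error, whether a known column space (erasure) or a known row space (deviation), halves its cost.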

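We summarize the results of this section in the following
theorem, which is the main theorem of this paper.

Theorem 7: Let $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ be a code of minimum rank
distance $d$, and let $A \in \mathbb{F}_q^{N \times n}$ and $y \in \mathbb{F}_q^{N \times m}$ be such that

$$\operatorname{rank} [A \;\; y] = N. \tag{28}$$

Set $\mu = n - \operatorname{rank} A$ and $\delta = N - (n - \mu)$. For any $T_1 \in \mathbb{F}_q^{(n-\mu) \times N}$,
$T_2 \in \mathbb{F}_q^{\delta \times N}$ and $\mathcal{U} \subseteq \{1, \ldots, n\}$ such that

\begin{align}
T_1 A I_{\mathcal{U}^c} &= I_{(n-\mu) \times (n-\mu)} \tag{29} \\
T_2 A &= 0 \tag{30} \\
\operatorname{rank} \begin{bmatrix} T_1 \\ T_2 \end{bmatrix} &= N \tag{31}
\end{align}

define $r = I_{\mathcal{U}^c} T_1 y$, $L = (I - I_{\mathcal{U}^c} T_1 A) I_{\mathcal{U}}$, and $V = T_2 y$.
Then a solution to the random network decoding problem (17)
is given by

$$\hat{x} = \operatorname*{argmin}_{x \in \mathcal{C}} \operatorname{rank}(r - x \mid L, V). \tag{32}$$

This solution is guaranteed to be unique if
$\operatorname{rank}(r - \hat{x} \mid L, V) \le (d - 1 + \mu + \delta)/2$.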
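
The quantities $r$, $L$ and $V$ in Theorem 7 can be computed from the received $[A \;\; y]$ by a single Gauss-Jordan pass. The sketch below is one possible realization, not the authors' procedure: it takes $\mathcal{U}^c$ to be the pivot columns of $A$ (which satisfies (29)), uses 0-based indices, and assumes a prime field; the function name `decoder_inputs` and the GF(2) example are hypothetical.

```python
def matmul(a, b, p):
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def decoder_inputs(A, y, p):
    """Return (r, L, V, mu, delta) as in Theorem 7, with U^c = pivot columns."""
    N, n, m = len(A), len(A[0]), len(y[0])
    aug = [[v % p for v in row] + [int(i == j) for j in range(N)]
           for i, row in enumerate(A)]
    pivots = []
    for col in range(n):
        k = len(pivots)
        piv = next((i for i in range(k, N) if aug[i][col]), None)
        if piv is None:
            continue
        aug[k], aug[piv] = aug[piv], aug[k]
        inv = pow(aug[k][col], p - 2, p)
        aug[k] = [(v * inv) % p for v in aug[k]]
        for i in range(N):
            if i != k and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(a - f * b) % p for a, b in zip(aug[i], aug[k])]
        pivots.append(col)
    rank = len(pivots)
    T1A = [row[:n] for row in aug[:rank]]
    T1 = [row[n:] for row in aug[:rank]]
    T2 = [row[n:] for row in aug[rank:]]
    U = [j for j in range(n) if j not in pivots]
    T1y = matmul(T1, y, p)
    r = [[0] * m for _ in range(n)]        # r = I_Uc * T1 * y
    proj = [[0] * n for _ in range(n)]     # proj = I_Uc * T1 * A
    for k, j in enumerate(pivots):
        r[j], proj[j] = T1y[k], T1A[k]
    L = [[(int(i == j) - proj[i][j]) % p for j in U] for i in range(n)]
    V = matmul(T2, y, p)                   # V = T2 * y
    return r, L, V, n - rank, N - rank


A = [[1, 1], [1, 1], [0, 0]]
y = [[1, 1], [0, 1], [0, 1]]               # rank [A y] = 3 = N
r, L, V, mu, delta = decoder_inputs(A, y, p=2)
print(r, L, V, mu, delta)
```

In this example the network delivered a rank-deficient $A$ (one erasure) together with two deviations, and the triple `(r, L, V)` is what an errors, erasures and deviations decoder for $\mathcal{C}$ would take as input.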

The problem (32) given in Theorem 7 will be called rank
decoding with errors, erasures and deviations, or simply
decoding. Note that it reduces to the standard rank decoding
problem (10) when $\mu = \delta = 0$.

Finally, we note that certain rank decoding problems with
“row and column erasures” have been previously proposed
in the literature [8]-[10], and correspond, respectively, to the
cases where $V_{\mu+1}, \ldots, V_{\mu+\delta}$ are unit row vectors and where
$L_1, \ldots, L_{\mu}$ are unit column vectors. Thus, the rank decoding
problem we propose is a strict generalization of the previous
ones.

Network error correction coding provides information theoretic security
against arbitrary non-ergodic errors in a network. As one part of this project
we investigated network error correction for multiple-source multicast as well
as non-multicast, generalizing previous results in the literature on
single-source multicast.

For multiple-source multicast, we considered the coherent case where the
network topology is known, as well as the noncoherent case where random
linear coding is done over an unknown network topology. For both cases, we
obtained the capacity region of reliable transmission rates under arbitrary
errors on up to $z$ links in the network, as

$$\sum_{i \in S} r_i \le m_S - 2z \quad \forall \text{ subsets of sources } S$$

where $r_i$ denotes the rate of source $i$, and $m_S$ denotes the minimum cut
capacity between the subset of sources $S$ and any sink. For noncoherent
coding, the region is asymptotically achievable with high probability over
the random network code, as packet length grows. Unlike the single-source
case where network coding distance arguments suffice to show achievability
of the capacity, in the multiple-source case we additionally rely on the
generic nature of the random network code in linearly combining packets
from different sources.
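
Membership of a rate vector in this region can be tested by enumerating the nonempty subsets of sources. In the sketch below, the min-cut values are supplied by the caller as a dictionary keyed by `frozenset`; that input format, the function name `in_region` and the numbers are assumptions for illustration, and computing the min-cuts themselves is outside the scope of the example.

```python
from itertools import combinations


def in_region(rates, min_cut, z):
    """Check sum_{i in S} r_i <= m_S - 2z for every nonempty subset S."""
    sources = range(len(rates))
    for size in range(1, len(rates) + 1):
        for subset in combinations(sources, size):
            if sum(rates[i] for i in subset) > min_cut[frozenset(subset)] - 2 * z:
                return False
    return True


# Two sources; hypothetical min-cut capacities to any sink.
min_cut = {
    frozenset({0}): 4,
    frozenset({1}): 4,
    frozenset({0, 1}): 6,
}
print(in_region((2, 2), min_cut, z=1))   # every constraint met, some with equality
print(in_region((3, 1), min_cut, z=1))   # source 0 exceeds m_{0} - 2z = 2
```

The number of subsets grows exponentially with the number of sources, so this direct check is only practical for small source sets.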

For non-multicast, finding the capacity region of a general network even
in the error-free case is an open problem. Thus, we considered the problem
of constructing a network error correction code from a given error-free
network code. Given any linear network code that achieves rate vector
$(r_1, r_2, \ldots, r_n)$, where $r_i$ is the transmission rate from source $i = 1, \ldots, n$
to its set of sink nodes, we can obtain a network code that achieves rate
vector $(r_1 - 2z, r_2 - 2z, \ldots, r_n - 2z)$ under arbitrary errors on up to $z$ links
in the network.

Another problem investigated in this project considered network error
and erasure correction coding under non-worst-case models of error and
erasure locations, in contrast to existing worst-case models which only consider
the number of errors and erasures. In the latter case, it is well-known that
optimal worst-case performance is achievable with random linear coding at
every node. On the other hand, for randomly located errors and erasures
we showed that the relative benefit of coding versus routing in the network
depends on the relative occurrence of errors and erasures and the network
topology, through theoretical analysis of a family of simple network
subgraphs consisting of multiple multi-hop paths, and simulation experiments
on randomly generated hypergraphs. We also analyzed the relative benefit
of designing network codes for in-network decoding versus decoding at the
sink, which depends on the network topology.

We also showed how back pressure routing/coding algorithms could be
extended to various classes of network codes, including pairwise wireline
network coding and one-hop wireless coding. Back pressure approaches
make routing and coding decisions based on local queue length information,
which provides robustness to networks that are changing ergodically
or adversarially. We showed how to define virtual queues appropriately so
as to efficiently optimize over different classes of network codes.